Refleqt Labs

https://reflqt.com/blog/
Research notes, experiment reports, and technical dispatches from Refleqt Labs.

Block-Diagonal Transformers Win 8 of 17 Downstream Tasks
https://reflqt.com/blog/bd4-wins-downstream-tasks/
Fri, 24 Apr 2026 00:00:00 GMT
Across 17 downstream tasks at five seeds each, a block-diagonal FFN beat the dense baseline eight times, tied nine times, and lost zero -- with compositional tasks dominating the win column.

The Compositional Advantage: Why Structure Helps Reasoning
https://reflqt.com/blog/compositional-advantage-structured-ffn/
Fri, 24 Apr 2026 00:00:00 GMT
The three tasks where a block-diagonal FFN beat the dense baseline by the largest margins are all compositional: Boolean SAT, logical reasoning, and multi-hop QA. Here is why that pattern is not an accident.

The EML Paradox: Better Language Model, Worse Symbolic Regression
https://reflqt.com/blog/eml-paradox-language-vs-symbolic/
Fri, 24 Apr 2026 00:00:00 GMT
An EML hybrid FFN beat BD4 on 125M-scale language modelling and failed to recover sin(x) in all 12 symbolic regression configurations we tested -- a clean demonstration that universality and trainability are different things.

What Frozen Probes Reveal About BD4 Representations
https://reflqt.com/blog/frozen-probes-bd4-representations/
Fri, 24 Apr 2026 00:00:00 GMT
A frozen linear probe reads 3.4x more structure out of a trained BD4 transformer than out of its random-init twin -- and the ratio grows monotonically across every checkpoint we tested.

JEPA Meets Group Theory: Perfect Accuracy on S_4 Composition
https://reflqt.com/blog/jepa-s4-group-composition/
Fri, 24 Apr 2026 00:00:00 GMT
A JEPA predictor with a dense FFN solved S_4 group composition perfectly across three seeds; the same predictor with a block-diagonal FFN collapsed to random guessing.

When Your Results Are Real But Your Sample Size Is Not
https://reflqt.com/blog/power-analysis-sample-size/
Fri, 24 Apr 2026 00:00:00 GMT
A power analysis of our SES-012 benchmark comparisons found that ten out of ten were underpowered at n=100 with a single seed -- and we are publishing the audit before the results it audits.

Structured Transformers in 2026: Where FFN Research Stands
https://reflqt.com/blog/structured-transformers-landscape-2026/
Fri, 24 Apr 2026 00:00:00 GMT
A map of the five live research threads reshaping how we think about the transformer FFN, and a preview of the seven-post series that follows.

The Case for Negative Results in ML Research
https://reflqt.com/blog/case-for-negative-results/
Mon, 20 Apr 2026 00:00:00 GMT
Most ML papers report only what worked. But the experiments that failed -- the hypotheses that were wrong, the architectures that underperformed -- carry just as much scientific information. Here is why negative results deserve publication, and how to write them up.

Conformal Prediction for Neural Networks: A Practical Introduction
https://reflqt.com/blog/conformal-prediction-intro/
Mon, 20 Apr 2026 00:00:00 GMT
Most neural networks output point predictions with no reliability guarantee. Conformal prediction wraps any model in a calibration layer that produces prediction sets with provable finite-sample coverage -- no distributional assumptions required. Here is how it works and how to implement it.

What Happens Inside a Transformer FFN Layer? A Visual Guide
https://reflqt.com/blog/inside-transformer-ffn/
Mon, 20 Apr 2026 00:00:00 GMT
Step inside a transformer FFN layer and see what actually fires. From neuron activation patterns to the key-value memory hypothesis, this visual walkthrough explains how two matrix multiplications and a nonlinearity encode most of what a language model knows.

Representation Theory for Machine Learning Engineers
https://reflqt.com/blog/representation-theory-for-ml/
Mon, 20 Apr 2026 00:00:00 GMT
Representation theory turns abstract group symmetries into concrete matrices you can compute with. This article walks through the core ideas -- from irreducible representations to the Peter-Weyl theorem -- with worked examples using S_3 and S_4, and shows why these structures keep appearing in neural network weight matrices.

Block-Diagonal Matrices: Why Sparsity Structure Matters
https://reflqt.com/blog/block-diagonal-matrices/
Sun, 19 Apr 2026 00:00:00 GMT
Dense matrices are the default in deep learning, but most of those parameters are redundant. Block-diagonal structure offers a principled middle ground between dense and sparse, with real hardware advantages.

Merkle Trees for Neural Network Verification
https://reflqt.com/blog/merkle-trees-verification/
Sun, 19 Apr 2026 00:00:00 GMT
Hash trees offer O(log n) membership proofs that can verify neural network outputs without re-running inference. A practical walkthrough of Merkle proofs, SHA-256 construction, and tamper detection for AI systems.

Multi-Hop Reasoning in Language Models: What Works and What Doesn't
https://reflqt.com/blog/multi-hop-reasoning/
Sat, 18 Apr 2026 00:00:00 GMT
Multi-hop reasoning -- answering questions that require chaining multiple facts -- remains one of the hardest open problems in NLP. We survey the benchmarks, the methods, the shortcuts, and the surprising failures that reveal how far we still have to go.

Scaling Experiments with Minimal Infrastructure
https://reflqt.com/blog/scaling-experiments-infrastructure/
Sat, 18 Apr 2026 00:00:00 GMT
Running 1000+ experiments does not require a cluster or an ML platform. Seed sweeps, auto-commit, systemd, and disciplined checkpointing can take you surprisingly far with a single machine.

Why Feed-Forward Layers Matter More Than You Think
https://reflqt.com/blog/why-ffn-matters/
Sat, 18 Apr 2026 00:00:00 GMT
Attention gets all the glory, but the FFN sub-layers hold the majority of parameters and do most of the computational heavy lifting. A look at what we actually know about what FFN computes.

Equivariant Neural Networks: Beyond Data Augmentation
https://reflqt.com/blog/equivariant-neural-networks/
Fri, 17 Apr 2026 00:00:00 GMT
Data augmentation handles symmetry by brute force. Equivariant neural networks handle it by design, baking group structure directly into weight-sharing patterns. A tour from G-CNNs to SE(3)-Transformers.

Statistical Rigor in Small-Scale ML Experiments
https://reflqt.com/blog/statistical-rigor-small-scale/
Fri, 17 Apr 2026 00:00:00 GMT
Three random seeds and a mean is not a confidence interval. A practical guide to bootstrap CIs, effect sizes, power analysis, and the statistical mistakes that plague ML papers.

Group Theory Meets Machine Learning: An Introduction
https://reflqt.com/blog/group-theory-meets-ml/
Wed, 15 Apr 2026 00:00:00 GMT
Symmetry is one of the most powerful organizing principles in mathematics. A growing body of work shows that encoding symmetry into neural network architectures leads to better generalization, fewer parameters, and more interpretable models.

The Grokking Phenomenon: When Neural Networks Suddenly Generalize
https://reflqt.com/blog/grokking-phenomenon/
Fri, 10 Apr 2026 00:00:00 GMT
Train a small network on modular arithmetic long past overfitting, and something unexpected happens: validation accuracy suddenly jumps from chance to near-perfect. This is grokking, and it has changed how we think about generalization.

Structured Matrices in Neural Networks: A Survey
https://reflqt.com/blog/structured-matrices-survey/
Sun, 05 Apr 2026 00:00:00 GMT
Dense matrix multiplications dominate the compute cost of modern neural networks. Structured matrices -- block-diagonal, Monarch, Kronecker, low-rank, and sparse -- offer a path to faster, smaller models without sacrificing accuracy.

Towards Verifiable AI: Formal Guarantees for Neural Network Outputs
https://reflqt.com/blog/towards-verifiable-ai/
Sat, 28 Mar 2026 00:00:00 GMT
Neural networks produce impressive outputs, but can we prove they are correct? A survey of formal verification, conformal prediction, and cryptographic proof methods for establishing guarantees on neural network behavior.