Research notes, plotted.

§ BLOG · 22 ENTRIES · NEWEST FIRST

DATE · TITLE · CATEGORY · TIME

APR 24 · Block-Diagonal Transformers Win 8 of 17 Downstream Tasks · breakthrough · 7 min
APR 24 · The Compositional Advantage: Why Structure Helps Reasoning · breakthrough · 7 min
APR 24 · The EML Paradox: Better Language Model, Worse Symbolic Regression · pivot · 7 min
APR 24 · What Frozen Probes Reveal About BD4 Representations · breakthrough · 6 min
APR 24 · JEPA Meets Group Theory: Perfect Accuracy on S_4 Composition · breakthrough · 8 min
APR 24 · When Your Results Are Real But Your Sample Size Is Not · kill · 6 min
APR 24 · Structured Transformers in 2026: Where FFN Research Stands · fundamentals · 8 min
APR 20 · The Case for Negative Results in ML Research · perspectives · 5 min
APR 20 · Conformal Prediction for Neural Networks: A Practical Introduction · tutorials · 10 min
APR 20 · What Happens Inside a Transformer FFN Layer? A Visual Guide · fundamentals · 8 min
APR 20 · Representation Theory for Machine Learning Engineers · fundamentals · 12 min
APR 19 · Block-Diagonal Matrices: Why Sparsity Structure Matters · fundamentals · 7 min
APR 19 · Merkle Trees for Neural Network Verification · tutorials · 8 min
APR 18 · Multi-Hop Reasoning in Language Models: What Works and What Doesn't · literature-review · 9 min
APR 18 · Scaling Experiments with Minimal Infrastructure · research-notes · 6 min
APR 18 · Why Feed-Forward Layers Matter More Than You Think · fundamentals · 6 min
APR 17 · Equivariant Neural Networks: Beyond Data Augmentation · literature-review · 10 min
APR 17 · Statistical Rigor in Small-Scale ML Experiments · research-notes · 7 min
APR 15 · Group Theory Meets Machine Learning: An Introduction · fundamentals · 7 min
APR 10 · The Grokking Phenomenon: When Neural Networks Suddenly Generalize · fundamentals · 6 min
APR 05 · Structured Matrices in Neural Networks: A Survey · literature-review · 7 min
MAR 28 · Towards Verifiable AI: Formal Guarantees for Neural Network Outputs · perspectives · 7 min
