Provides a computationally efficient 'early warning' system for emergent capabilities like grokking and induction head formation using 2-datapoint reduced density matrices.
April 1, 2026
Original Paper
From Density Matrices to Phase Transitions in Deep Learning: Spectral Early Warnings and Interpretability
arXiv · 2603.29805
The Takeaway
This method lets researchers anticipate and interpret phase transitions during training without expensive probes. It adapts an observable from quantum chemistry, the 2-datapoint reduced density matrix, into a single unified signal that indicates when and how a model is reorganizing its internal logic.
From the abstract
A key problem in the modern study of AI is predicting and understanding emergent capabilities in models during training. Inspired by methods for studying reactions in quantum chemistry, we present the "2-datapoint reduced density matrix". We show that this object provides a computationally efficient, unified observable of phase transitions during training. By tracking the eigenvalue statistics of the 2RDM over a sliding window, we derive two complementary signals: the spectral heat capacity, wh