CircuitProbe identifies reasoning circuits in Transformers 1000x faster than brute-force methods and predicts the efficacy of layer duplication.
April 2, 2026
Original Paper
CircuitProbe: Predicting Reasoning Circuits in Transformers via Stability Zone Detection
arXiv · 2604.00716
The Takeaway
CircuitProbe allows developers to optimize small language models (SLMs) in minutes on a CPU by identifying the specific layers that benefit from duplication. The finding that layer duplication helps models under 3B parameters but fails above 7B provides a practical rule of thumb for model scaling.
From the abstract
Transformer language models contain localized reasoning circuits: contiguous layer blocks that improve reasoning when duplicated at inference time. Finding these circuits currently requires brute-force sweeps costing 25 GPU hours per model. We propose CircuitProbe, which predicts circuit locations from activation statistics in under 5 minutes on CPU, providing a speedup of three to four orders of magnitude. We find that reasoning circuits come in two types: stability circuits in early layers, de
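The core operation the paper evaluates, duplicating a contiguous block of layers at inference time, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `duplicate_block` helper and the string placeholders for layers are hypothetical, standing in for a real model's layer list.

```python
def duplicate_block(layers, start, end):
    """Return a new layer sequence in which the contiguous block
    layers[start:end] runs twice in a row at inference time.

    Hypothetical helper for illustration; a real model would hold
    actual layer modules here rather than strings."""
    return layers[:end] + layers[start:end] + layers[end:]

# Toy 8-layer model; duplicate the block spanning layers 2-4.
layers = [f"layer_{i}" for i in range(8)]
expanded = duplicate_block(layers, start=2, end=5)
# Layers 2, 3, and 4 now execute twice, in order, before layer 5.
```

Because the duplicated block keeps its original order and weights, no retraining is needed; the forward pass simply revisits those layers once more.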