A 0.26M-parameter model using continuous dynamics outperforms 27M-parameter recursive models on complex logic tasks like Sudoku-Extreme.
March 25, 2026
Original Paper
Dynamical Systems Theory Behind a Hierarchical Reasoning Model
arXiv · 2603.22871
The Takeaway
By reformulating discrete reasoning as continuous Neural ODEs with hyperspherical repulsion, this model achieves roughly 100x parameter compression while improving accuracy. It demonstrates that mathematically rigorous latent dynamics can replace brute-force scaling for algorithmic reasoning.
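To make the idea concrete, here is a minimal NumPy sketch of the two ingredients the takeaway names: a latent state evolved by an ODE whose velocity is projected onto the tangent space of the unit hypersphere (so the state never collapses toward the origin), plus a pairwise repulsion term that pushes latent vectors apart. The drift function `tanh(W z)`, the step size, and the repulsion weight are all illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def project_tangent(z, v):
    # Remove the radial component of v so the update moves along the sphere.
    return v - np.sum(v * z, axis=-1, keepdims=True) * z

def repulsion(Z):
    # Pairwise repulsive force pushing latent states apart
    # (a simple stand-in for the paper's hyperspherical repulsion).
    diff = Z[:, None, :] - Z[None, :, :]                 # (n, n, d)
    dist2 = np.sum(diff**2, axis=-1, keepdims=True) + 1e-6
    return np.sum(diff / dist2, axis=1)                  # (n, d)

def ode_step(Z, W, dt=0.05):
    # One Euler step of dz/dt = tanh(W z) + repulsion, kept on the sphere.
    drift = np.tanh(Z @ W.T)
    v = project_tangent(Z, drift + 0.1 * repulsion(Z))
    Z = Z + dt * v
    # Renormalize to correct Euler discretization drift off the sphere.
    return Z / np.linalg.norm(Z, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d, n = 8, 4                                   # latent dim, number of states
W = rng.normal(size=(d, d)) / np.sqrt(d)      # toy drift weights
Z = rng.normal(size=(n, d))
Z /= np.linalg.norm(Z, axis=-1, keepdims=True)

for _ in range(100):
    Z = ode_step(Z, W)

print(np.linalg.norm(Z, axis=-1))  # all states remain unit-norm
```

The tangent-space projection plus renormalization is what gives the "no representational collapse" guarantee its geometric teeth: the latent norm is an invariant of the flow, so states cannot shrink to a degenerate point.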
From the abstract
Current large language models (LLMs) rely primarily on linear sequence generation and massive parameter counts, yet they struggle severely with complex algorithmic reasoning. While recent reasoning architectures, such as the Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM), demonstrate that compact recursive networks can tackle these tasks, their training dynamics often lack rigorous mathematical guarantees, leading to instability and representational collapse. We propose the Co