Achieves an 80% reduction in Chain-of-Thought (CoT) tokens while slightly increasing reasoning accuracy.
March 19, 2026
Original Paper
TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL
arXiv · 2603.17449
The Takeaway
Introduces the metric of 'Minimal Sufficient Length' (MSL) and uses GRPO (Group Relative Policy Optimization) to train models to find the shortest correct reasoning path. This drastically reduces inference latency and cost for complex reasoning tasks without the accuracy degradation typically seen in CoT pruning.
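To make the training signal concrete, here is a minimal sketch of how a length-aware reward could plug into GRPO's group-relative advantage computation. The reward shape and the 0.5 penalty weight are illustrative assumptions, not the paper's exact formulation:

```python
# Hypothetical sketch (not the paper's exact reward): correct-and-short
# completions score highest, so group-relative advantages push the
# policy toward shorter sufficient reasoning.
from statistics import mean, pstdev

def length_aware_reward(correct: bool, n_tokens: int, budget: int = 512) -> float:
    """1.0 for a correct answer, minus a linear penalty on tokens used
    (capped at the budget); incorrect answers get 0."""
    if not correct:
        return 0.0
    return 1.0 - 0.5 * min(n_tokens, budget) / budget

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO normalizes rewards within a sampled group:
    advantage_i = (r_i - mean(r)) / (std(r) + eps)."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

# Four sampled completions for one prompt: (correct?, CoT token count).
group = [(True, 480), (True, 120), (False, 300), (True, 256)]
rewards = [length_aware_reward(c, n) for c, n in group]
advs = group_relative_advantages(rewards)
# The short correct completion (120 tokens) gets the largest advantage.
```

The key property is that the penalty only applies to correct completions, so the policy is never rewarded for being short-but-wrong.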
From the abstract
Large language models achieve breakthroughs in complex reasoning via long chain-of-thought sequences. However, this often leads to severe reasoning inflation, causing substantial computational redundancy. To maximize intelligence per token, we introduce a theoretical metric, Minimal Sufficient Length (MSL). MSL rigorously characterizes the shortest reasoning length that preserves answer correctness. We provide a recursive definition based on independently sampled sequences and prove the existence [...]
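The notion of "the shortest reasoning length that preserves answer correctness" can be illustrated with a toy search. Everything here is a hypothetical stand-in (the `answer_is_correct` checker mocks re-running the model on a truncated trace, and the monotonicity assumption is mine, not the paper's):

```python
# Hypothetical illustration of Minimal Sufficient Length: the shortest
# prefix of a reasoning trace that still yields the correct answer.
def answer_is_correct(trace: list[str], prefix_len: int) -> bool:
    # Stand-in for re-running the model on a truncated CoT; here we
    # pretend the answer is derivable once the key step has appeared.
    return "key_step" in trace[:prefix_len]

def minimal_sufficient_length(trace: list[str]) -> int:
    """Binary-search the shortest sufficient prefix, assuming
    correctness is monotone in prefix length."""
    lo, hi = 0, len(trace)
    while lo < hi:
        mid = (lo + hi) // 2
        if answer_is_correct(trace, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

trace = ["setup", "detour", "key_step", "recap", "answer"]
print(minimal_sufficient_length(trace))  # 3
```

In practice an MSL estimate would come from sampling many independent truncated continuations rather than a deterministic check, which is what the abstract's "recursive definition based on independently sampled sequences" points at.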