AI & ML Scaling Insight

Identifies the 'Caterpillar Tree' as the theoretically optimal structure for test-time computation and backtracking in LLMs.

March 25, 2026

Original Paper

Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models

Amir Azarmehr, Soheil Behnezhad, Alma Ghafari

arXiv · 2603.22784

The Takeaway

The paper provides a theoretical foundation for how much backtracking is needed to maximize model performance under a fixed compute budget. This simplifies the search space for 'reasoning' models (like o1) by proving that complex branching is often less efficient than a specific caterpillar-style path.
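The paper's full construction isn't reproduced in this digest, but the structure it names is a standard graph-theory object: a caterpillar is a tree in which all vertices lie within distance one of a central path, i.e. deleting every leaf leaves a path. A minimal sketch of that check (function name and adjacency-dict representation are illustrative, not from the paper):

```python
def is_caterpillar(adj):
    """Check whether a tree, given as an adjacency dict {vertex: [neighbors]},
    is a caterpillar: removing all leaves must leave a path (or nothing)."""
    # The spine is the set of non-leaf vertices.
    spine = {v for v, nbrs in adj.items() if len(nbrs) > 1}
    if len(spine) <= 1:
        return True  # empty or single-vertex spine is trivially a path
    # In a tree, the internal vertices always induce a connected subtree,
    # so the spine is a path iff every spine vertex has at most two
    # spine neighbors.
    for v in spine:
        if sum(1 for u in adj[v] if u in spine) > 2:
            return False
    return True

# A star (one hub, all leaves) is a caterpillar; a "spider" with three
# length-2 legs is the smallest tree that is not.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
spider = {0: [1, 3, 5], 1: [0, 2], 2: [1], 3: [0, 4], 4: [3], 5: [0, 6], 6: [5]}
```

Intuitively, a caterpillar-shaped search corresponds to one main reasoning path with short, single-step detours, rather than deeply nested branches.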

From the abstract

Large language models (LLMs) can often produce substantially better outputs when allowed to use additional test-time computation, such as sampling, chain of thought, backtracking, or revising partial solutions. Despite the growing empirical success of such techniques, there is limited theoretical understanding of how inference time computation should be structured, or what constitutes an optimal use of a fixed computation budget. We model test-time computation as an algorithm interacting with a M