AI & ML Efficiency Breakthrough

Introduces a training-free framework that lets LLM agents dynamically scale their reasoning depth to fit a predefined token and tool budget.

arXiv · March 16, 2026 · 2603.12634

Yushu Li, Wenlong Deng, Jiajin Li, Xiaoxiao Li

Why it matters

Most agents either overspend their resources or fail on complex tasks. Budget-conditioned node selection lets an agent shift from exploration to exploitation as its budget depletes, improving reliability without expensive fine-tuning.
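
To make the exploration-to-exploitation shift concrete, here is a minimal Python sketch. It is an illustration only, not the paper's BAVT algorithm: the `Node` class, the `select_node` function, and the scoring rule (value plus a novelty bonus weighted by the remaining-budget ratio) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A candidate reasoning step (hypothetical structure, not from the paper)."""
    name: str
    value: float  # estimated usefulness of this node, assumed to lie in [0, 1]
    visits: int   # how many times this node has already been expanded


def select_node(nodes, remaining, total):
    """Pick the next node, weighting novelty by the fraction of budget left.

    With a full budget the novelty bonus dominates (exploration); as the
    budget approaches zero only the estimated value matters (exploitation).
    """
    if total <= 0:
        raise ValueError("total budget must be positive")
    ratio = max(0.0, min(1.0, remaining / total))

    def score(node):
        novelty = 1.0 / (1 + node.visits)  # rarely visited nodes score higher
        return node.value + ratio * novelty

    return max(nodes, key=score)


candidates = [
    Node("proven", value=0.6, visits=5),
    Node("untried", value=0.3, visits=0),
]
early = select_node(candidates, remaining=100, total=100)
late = select_node(candidates, remaining=0, total=100)
print(early.name, late.name)
```

In this toy setup the untried node wins while the budget is full, and the proven node wins once the budget is spent. The real framework may score and prune nodes quite differently.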

From the abstract

Test-time scaling has become a dominant paradigm for improving LLM agent reliability, yet current approaches treat compute as an abundant resource, allowing agents to exhaust token and tool budgets on redundant steps or dead-end trajectories. Existing budget-aware methods either require expensive fine-tuning or rely on coarse, trajectory-level heuristics that cannot intervene mid-execution. We propose the Budget-Aware Value Tree (BAVT), a training-free inference-time framework that models multi-
