If an AI agent thinks too long, it can get worse at its job; for structured tool-calling tasks, a short reasoning budget often works best.
April 3, 2026
Original Paper
Brief Is Better: Non-Monotonic Chain-of-Thought Budget Effects in Function-Calling Language Agents
arXiv · 2604.02155
The Takeaway
While we tend to assume more deliberation is always better, long reasoning chains can hurt an AI's accuracy on structured tasks like calling an API. The study finds a non-monotonic relationship between reasoning length and performance: for many tool-use tasks, the best output comes from a minimal thinking budget.
From the abstract
How much should a language agent think before taking action? Chain-of-thought (CoT) reasoning is widely assumed to improve agent performance, but the relationship between reasoning length and accuracy in structured tool-use settings remains poorly understood. We present a systematic study of CoT budget effects on function-calling agents, sweeping six token budgets (0--512) across 200 tasks from the Berkeley Function Calling Leaderboard v3 Multiple benchmark. Our central finding is a striking non-monotonic relationship between reasoning length and accuracy.
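To make the experimental setup concrete, here is a minimal sketch of the kind of budget sweep the abstract describes: evaluate a function-calling agent at each CoT token budget and record accuracy per budget. The agent itself is stubbed out with a placeholder (`solve_task` is hypothetical, not the paper's code); a real run would cap the model's reasoning tokens and compare its emitted tool call against the task's ground truth.

```python
import random

# Six CoT token budgets spanning 0-512, as in the paper's sweep.
BUDGETS = [0, 32, 64, 128, 256, 512]

def solve_task(task, cot_budget):
    """Hypothetical stand-in for the agent: returns True if the emitted
    tool call matched the task's expected call. A real implementation
    would prompt the model with at most `cot_budget` reasoning tokens
    before it emits the function call."""
    # Placeholder logic so the sketch runs; replace with real model calls.
    rng = random.Random(task * 1000 + cot_budget)
    return rng.random() < 0.5

def sweep(tasks, budgets=BUDGETS):
    """Accuracy at each budget: fraction of tasks solved correctly."""
    return {b: sum(solve_task(t, b) for t in tasks) / len(tasks)
            for b in budgets}

# e.g., 200 benchmark tasks, as in the BFCL v3 Multiple subset used here.
accuracies = sweep(range(200))
```

Plotting `accuracies` against `BUDGETS` is what reveals whether performance rises monotonically with more thinking or, as the paper reports, peaks at a small budget and then degrades.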