AI & ML Breaks Assumption

Listed API prices for reasoning language models (RLMs) are shown to be highly misleading: the model with the lower listed price can end up costing as much as 28x more in practice.

March 26, 2026

Original Paper

The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

Lingjiao Chen, Chi Zhang, Yeye He, Ion Stoica, Matei Zaharia, James Zou

arXiv · 2603.23971

The Takeaway

The study finds that the number of 'thinking tokens' a model consumes varies so widely (up to a 900% difference between models) that the listed per-token price fails as a cost metric. This is a critical finding for any practitioner budgeting for or benchmarking frontier models such as o1 or Gemini 2.0.
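
To make the arithmetic behind a price reversal concrete, here is a minimal sketch. It is not taken from the paper. The two models, their prices, and their token counts are hypothetical values chosen for illustration, and the helper `request_cost` is an invented name. The sketch also assumes that thinking tokens are billed at the output-token rate, which may not hold for every provider.

```python
def request_cost(input_tokens, thinking_tokens, answer_tokens,
                 input_price_per_m, output_price_per_m):
    """Return the dollar cost of one request.

    Assumption: thinking tokens are billed at the output-token rate.
    Prices are expressed in dollars per one million tokens.
    """
    billed_output = thinking_tokens + answer_tokens
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000


# Hypothetical model A: lower listed price, but it thinks for 20,000 tokens.
cost_a = request_cost(
    input_tokens=1_000, thinking_tokens=20_000, answer_tokens=500,
    input_price_per_m=0.5, output_price_per_m=2.0,
)

# Hypothetical model B: 4x the listed output price, but it thinks for 2,000 tokens.
cost_b = request_cost(
    input_tokens=1_000, thinking_tokens=2_000, answer_tokens=500,
    input_price_per_m=1.0, output_price_per_m=8.0,
)

# The model with the lower listed price is the more expensive one per request.
print(f"A (cheaper listing): ${cost_a:.4f}")
print(f"B (pricier listing): ${cost_b:.4f}")
print(f"A costs {cost_a / cost_b:.2f}x as much as B")
```

With these made-up numbers, the cheaper-listed model A costs roughly twice as much per request as model B, because its tenfold larger thinking-token count outweighs its fourfold lower output price. Budgeting from measured tokens per task, rather than from the listed price alone, avoids this trap.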

From the abstract

Developers and consumers increasingly choose reasoning language models (RLMs) based on their listed API prices. However, how accurately do these prices reflect actual inference costs? We conduct the first systematic study of this question, evaluating 8 frontier RLMs across 9 diverse tasks covering competition math, science QA, code generation, and multi-domain reasoning. We uncover the pricing reversal phenomenon: in 21.8% of model-pair comparisons, the model with a lower listed price actually i