AI & ML Scaling Insight

Shows how two uncertainty estimation signals, self-consistency and verbalized confidence, scale with sampling and complement each other in reasoning models.

March 20, 2026

Original Paper

How Uncertainty Estimation Scales with Sampling in Reasoning Models

Maksym Del, Markus Kängsepp, Marharyta Domnich, Ardi Tampuu, Lisa Yankovskaya, Meelis Kull, Mark Fishel

arXiv · 2603.19118

The Takeaway

The paper offers guidance for deploying reasoning models (such as R1 variants) more reliably: a hybrid estimator using just two samples can outperform single-signal estimators even when those are given much higher sampling budgets. It also shows that the signals behave differently across domains, with notably better scaling in RLVR-trained domains such as mathematics.
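To make the two signals concrete, here is a minimal Python sketch of how they could be computed from a handful of parallel samples and then blended. It is an illustration under stated assumptions, not the paper's estimator: the function names, the equal-weight average, and the choice to average verbalized confidence only over samples that agree with the majority answer are all hypothetical, since this summary does not specify how the hybrid is built.

```python
from collections import Counter


def self_consistency(answers):
    """Return the most common answer and the fraction of samples that gave it."""
    counts = Counter(answers)
    # On ties, Counter.most_common keeps first-seen order (Python 3.7+).
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


def mean_verbalized_confidence(answers, confidences, target):
    """Average the model's stated confidence over samples that answered `target`."""
    stated = [c for a, c in zip(answers, confidences) if a == target]
    return sum(stated) / len(stated)


def hybrid_confidence(answers, confidences, weight=0.5):
    """Blend agreement and stated confidence.

    ASSUMPTION: a simple weighted average, chosen for illustration only;
    the combination rule used in the paper may differ.
    """
    top_answer, agreement = self_consistency(answers)
    verbal = mean_verbalized_confidence(answers, confidences, top_answer)
    return top_answer, weight * agreement + (1 - weight) * verbal


# Two samples that agree: full agreement, averaged stated confidence.
agree_answer, agree_score = hybrid_confidence(["42", "42"], [0.9, 0.7])

# Two samples that disagree: agreement drops to 0.5 and pulls the score down.
split_answer, split_score = hybrid_confidence(["42", "41"], [0.9, 0.7])
```

With only two samples, self-consistency alone can take just two values (0.5 or 1.0), so the verbalized confidence is what gives the blended score finer resolution in this sketch.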

From the abstract

Uncertainty estimation is critical for deploying reasoning language models, yet remains poorly understood under extended chain-of-thought reasoning. We study parallel sampling as a fully black-box approach using verbalized confidence and self-consistency. Across three reasoning models and 17 tasks spanning mathematics, STEM, and humanities, we characterize how these signals scale. Both self-consistency and verbalized confidence scale in reasoning models, but self-consistency exhibits lower initia