AI & ML Paradigm Shift

LLMs compute and cache confidence scores automatically during answer generation, well before they are prompted to verbalize them.

March 19, 2026

Original Paper

How do LLMs Compute Verbal Confidence

Dharshan Kumaran, Arthur Conmy, Federico Barbero, Simon Osindero, Viorica Patraucean, Petar Velickovic

arXiv · 2603.17839

The Takeaway

The paper shows that verbal confidence is not a post-hoc reconstruction or a simple log-probability readout, but a sophisticated internal self-evaluation stored in specific hidden states. This gives a mechanistic target both for improving model calibration and for detecting hallucinations with internal probes.
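If confidence really lives in specific hidden states, a natural follow-up is to read it out with a linear probe. The sketch below is purely illustrative, not the paper's method: it uses synthetic vectors as stand-ins for hidden states and fits a logistic-regression probe with plain NumPy to predict whether the model's answer was correct.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each row of X stands in for a hidden state captured
# during answer generation; y marks whether the answer was correct.
# Real data would come from running an LLM and labeling its answers.
d, n = 32, 2000
w_true = rng.normal(size=d)                  # synthetic "confidence direction"
X = rng.normal(size=(n, d))                  # stand-in hidden states
p = 1.0 / (1.0 + np.exp(-X @ w_true))        # synthetic confidence signal
y = (rng.random(n) < p).astype(float)        # correctness labels

# Logistic-regression probe trained by batch gradient descent.
w = np.zeros(d)
lr = 0.1
for _ in range(500):
    pred = 1.0 / (1.0 + np.exp(-X @ w))
    w -= lr * (X.T @ (pred - y)) / n         # gradient of the log loss

# Probe accuracy: how well hidden states predict correctness.
acc = ((1.0 / (1.0 + np.exp(-X @ w)) > 0.5) == (y > 0.5)).mean()
```

On data like this the probe recovers the planted direction easily; the interesting question, which the paper addresses mechanistically, is where in a real model's residual stream such a direction exists and when it is written.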

From the abstract

Verbal confidence (prompting LLMs to state their confidence as a number or category) is widely used to extract uncertainty estimates from black-box models. However, how LLMs internally generate such scores remains unknown. We address two questions: first, when confidence is computed (just-in-time when requested, or automatically during answer generation and cached for later retrieval); and second, what verbal confidence represents (token log-probabilities, or a richer evaluation of answer quality).