AI & ML Paradigm Shift

Using Signal Detection Theory, this work proves that LLM calibration and 'metacognitive efficiency' (knowing what you know) are distinct, dissociable capacities.

March 27, 2026

Original Paper

Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

Jon-Paul Cacioli

arXiv · 2603.25112

The Takeaway

It introduces the meta-d' metric to the ML field, revealing that two models with the same accuracy can have vastly different abilities to detect their own errors. This changes how we should select and deploy models for high-stakes human-AI collaboration.

From the abstract

Standard evaluation of LLM confidence relies on calibration metrics (ECE, Brier score) that conflate two distinct capacities: how much a model knows (Type-1 sensitivity) and how well it knows what it knows (Type-2 metacognitive sensitivity). We introduce an evaluation framework based on Type-2 Signal Detection Theory that decomposes these capacities using meta-d' and the metacognitive efficiency ratio M-ratio. Applied to four LLMs (Llama-3-8B-Instruct, Mistral-7B-Instruct-v0.3, Llama-3-8B-Base,