AI is actually the most confident when it's completely making stuff up.
March 24, 2026
Original Paper
Epistemic Observability in Language Models
arXiv · 2603.20531
The Takeaway
We usually assume that an AI that sounds certain is more likely to be right. This study finds the opposite: across four major model families, self-reported confidence actually rises as accuracy falls. The researchers prove that this isn't a software bug but a fundamental "observational" barrier, which makes it impossible to catch AI fabrications just by asking the model how sure it is.
From the abstract
We find that models report highest confidence precisely when they are fabricating. Across four model families (OLMo-3, Llama-3.1, Qwen3, Mistral), self-reported confidence inversely correlates with accuracy, with AUC ranging from 0.28 to 0.36 where 0.5 is random guessing. We prove, under explicit formal assumptions, that this is not a capability gap but an observational one. Under text-only observation, where a supervisor sees only the model's output text, no monitoring system can reliably distinguish […]
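To make the AUC figures concrete, here is a minimal sketch (not from the paper, with invented toy data) of what an AUC below 0.5 means when self-reported confidence is used to predict whether an answer is correct: confidence ranks fabrications *above* correct answers.

```python
def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) formulation: the probability
    that a randomly chosen positive outranks a randomly chosen negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy illustration: label 1 = correct answer, 0 = fabrication.
# Here the fabrications carry the *highest* self-reported confidences.
confidence = [0.95, 0.90, 0.85, 0.60, 0.55, 0.50]
correct    = [0,    0,    1,    1,    1,    0]

print(round(auc(confidence, correct), 2))  # -> 0.33, well below the 0.5 of random guessing
```

An AUC of 0.33 on this toy data sits in the paper's reported 0.28–0.36 range: a monitor thresholding on confidence would flag correct answers as suspect more often than fabrications.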