AI is actually the most confident when it's completely making stuff up.
March 24, 2026
Original Paper
Epistemic Observability in Language Models
arXiv · 2603.20531
The Takeaway
We usually assume that an AI that sounds certain is more likely to be right. This study finds the opposite: across four major model families, self-reported confidence actually rises as accuracy falls. The researchers prove that this isn't a software bug but a fundamental "observational" barrier, which makes it impossible to catch AI fabrications just by asking the model how sure it is.
From the abstract
We find that models report highest confidence precisely when they are fabricating. Across four model families (OLMo-3, Llama-3.1, Qwen3, Mistral), self-reported confidence inversely correlates with accuracy, with AUC ranging from 0.28 to 0.36 where 0.5 is random guessing. We prove, under explicit formal assumptions, that this is not a capability gap but an observational one. Under text-only observation, where a supervisor sees only the model's output text, no monitoring system can reliably distinguish […]
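To make the AUC figures concrete, here is a minimal sketch (not from the paper, with invented toy data) of what an AUC below 0.5 means when self-reported confidence is used to predict whether an answer is correct: confidence ranks fabrications *above* correct answers.

```python
def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) formulation: the probability
    that a randomly chosen positive outranks a randomly chosen negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy illustration: label 1 = correct answer, 0 = fabrication.
# Here the fabrications carry the *highest* self-reported confidences.
confidence = [0.95, 0.90, 0.85, 0.60, 0.55, 0.50]
correct    = [0,    0,    1,    1,    1,    0]

print(round(auc(confidence, correct), 2))  # -> 0.33, well below the 0.5 of random guessing
```

An AUC of 0.33 on this toy data sits in the paper's reported 0.28–0.36 range: a monitor thresholding on confidence would flag correct answers as suspect more often than fabrications.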