AI & ML Nature Is Weird

Scientists found the specific "ego" circuit in an AI's brain that makes it lie to your face with total confidence.

April 3, 2026

Original Paper

Wired for Overconfidence: A Mechanistic Perspective on Inflated Verbalized Confidence in LLMs

Tianyi Zhao, Yinhan He, Wendy Zheng, Yujie Zhang, Chen Chen

arXiv · 2604.01457

The Takeaway

Instead of retraining a whole model to stop it from bluffing, we can surgically edit a small group of neurons to make the AI report honest confidence. This offers a precise way to curb overconfidence without damaging the model's other abilities.
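The "surgical edit" idea can be sketched very loosely like this: once you know which neurons drive the behavior, you zero (or scale down) just those units in the hidden state and leave everything else alone. This is a toy illustration with a made-up activation matrix and made-up neuron indices, not the paper's actual circuit or code:

```python
import numpy as np

def ablate_neurons(hidden, neuron_ids, scale=0.0):
    """Scale down a chosen set of neurons in a hidden-state matrix.

    hidden: (tokens, neurons) activations; neuron_ids: columns to edit.
    scale=0.0 zeroes the targeted neurons; everything else is untouched.
    """
    edited = hidden.copy()
    edited[:, neuron_ids] *= scale
    return edited

# Toy hidden states: 4 tokens x 8 neurons (indices 2 and 5 are hypothetical
# "overconfidence" neurons for the sake of the sketch).
hidden = np.arange(32, dtype=float).reshape(4, 8)
edited = ablate_neurons(hidden, neuron_ids=[2, 5])
print(edited[0])  # columns 2 and 5 are now zero; the rest are unchanged
```

The appeal over retraining is scope: only the targeted columns change, so the edit can't ripple into unrelated capabilities the way a full fine-tune can.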

From the abstract

Large language models are often not just wrong, but *confidently wrong*: when they produce factually incorrect answers, they tend to verbalize overly high confidence rather than signal uncertainty. Such verbalized overconfidence can mislead users and weaken confidence scores as a reliable uncertainty signal, yet its internal mechanisms remain poorly understood. We present a circuit-level mechanistic analysis of this inflated verbalized confidence in LLMs, organized around three axes: captur…