AI & ML Practical Magic

Training AI to be polite is actually making it more dangerous in medical and military situations.

April 29, 2026

Original Paper

Socratic AI Training for Epistemic Integrity Addressing Structural Sycophancy in Large Language Models Through Adversarial Epistemic Training

Michael Kitamura

SSRN · 6482498

The Takeaway

Current AI training methods focus on making models agreeable to the user, a flaw known as sycophancy. This Socratic training method replaces that reward signal with a focus on accuracy under pressure. Instead of trying to please the operator, the AI is trained to defend the truth even when the user is being adversarial. This is critical for high-stakes environments where an AI that just says yes could lead to a catastrophic error. Shifting from agreeableness to epistemic integrity makes AI a reliable partner for experts in the field.

From the abstract

Frontier AI systems are being contracted and deployed in strategic military, intelligence, and command advisory roles. Every one of them was built on a baseline shaped by civilian interaction data-millions of users asking about recipes, travel, and homework. That baseline is structurally ill-suited for highstakes deployment. The mismatch is not a matter of capability. It is a matter of what the model has learned to optimize for. All existing LLMs are structurally sycophantic-not by architectural

Read the original paper →

← Back to today's papers