AI & ML Paradigm Challenge

A small Bayesian engine paired with a simple language parser beats the world's largest LLMs at medical diagnosis for a fraction of the cost.

April 23, 2026

Original Paper

Statistics, Not Scale: Modular Medical Dialogue with Bayesian Belief Engine

Yusuf Kesmen, Fay Elhassan, Jiayi Ma, Julien Stalhandske, David Sasu, Alexandra Kulinkina, Akhil Arora, Lars Klein, Mary-Anne Hartley

arXiv · 2604.20022

The Takeaway

Modular architectures show that scaling is not the only path to superior medical reasoning. This system uses an LLM only to parse language, leaving the heavy lifting of diagnosis to a deterministic statistical engine, which sidesteps the safety and hallucination risks that plague frontier models in healthcare settings. Current trends favor massive, general-purpose models, but this result suggests that strict statistical rigor is more effective for high-stakes decisions. Hospitals could deploy these efficient, specialized systems on modest hardware without sacrificing patient safety.
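To make the division of labor concrete, here is a minimal, purely illustrative sketch of the statistical half of such a pipeline: a deterministic Bayesian update over candidate diagnoses, with the language model's role reduced to producing structured findings (here stubbed out as plain inputs). The diagnoses, symptoms, and probabilities below are invented for the example and are not taken from the paper or from BMBE itself.

```python
# Toy Bayesian "belief engine": the reasoning side of a modular diagnostic
# system. All diagnoses, findings, and probabilities are hypothetical.

def update_beliefs(prior, likelihoods, finding, present):
    """Bayes-update a belief distribution over diagnoses given one finding.

    prior:       dict mapping diagnosis -> P(diagnosis)
    likelihoods: dict mapping diagnosis -> {finding: P(finding | diagnosis)}
    present:     whether the finding was reported present or absent
    """
    posterior = {}
    for dx, p in prior.items():
        p_finding = likelihoods[dx][finding]
        evidence = p_finding if present else 1.0 - p_finding
        posterior[dx] = p * evidence
    total = sum(posterior.values())
    return {dx: p / total for dx, p in posterior.items()}

# Hypothetical priors and symptom likelihoods.
prior = {"flu": 0.5, "cold": 0.5}
likelihoods = {
    "flu":  {"fever": 0.9, "sneezing": 0.3},
    "cold": {"fever": 0.2, "sneezing": 0.8},
}

# In a real modular system, an LLM parser would emit these findings from
# patient utterances; here they are hard-coded for clarity.
beliefs = update_beliefs(prior, likelihoods, "fever", present=True)
beliefs = update_beliefs(beliefs, likelihoods, "sneezing", present=False)
```

Because the update rule is deterministic, every posterior is auditable and reproducible, which is exactly the property the article argues hallucination-prone generative models cannot offer for the diagnostic step itself.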

From the abstract

Large language models are increasingly deployed as autonomous diagnostic agents, yet they conflate two fundamentally different capabilities: natural-language communication and probabilistic reasoning. We argue that this conflation is an architectural flaw, not an engineering shortcoming. We introduce BMBE (Bayesian Medical Belief Engine), a modular diagnostic dialogue framework that enforces a strict separation between language and reasoning: an LLM serves only as a sensor, parsing patient utterances […]