AI & ML Nature Is Weird

A single model can house an entire boardroom of arguing experts within its internal activation space.

April 29, 2026

Original Paper

Latent Agents: A Post-Training Procedure for Internalized Multi-Agent Debate

John Seon Keun Yi, Aaron Mueller, Dokyun Lee

arXiv · 2604.24881

The Takeaway

Multi-agent debate usually requires running several instances of a model to reach a consensus. This post-training procedure distills those external arguments into agent-specific subspaces within a single neural network. Probing the fine-tuned model reveals distinct directions in its activation space that represent different personas or perspectives: the model simulates the debate internally, pairing the accuracy of a team with the speed of an individual. Practitioners can thus bypass the high latency of multi-agent systems while keeping the reasoning benefits of diverse viewpoints, and the internalized debate significantly improves accuracy without increasing the model's footprint.
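To make the probing idea concrete, here is a minimal sketch (not the paper's actual method) of how a linear probe can recover persona-specific directions from hidden activations. Everything below is synthetic: the dimensions, the persona directions, and the difference-of-means probe are illustrative assumptions, standing in for real activations extracted from a fine-tuned model.

```python
# Illustrative linear probe on synthetic "persona" activations.
# Assumption: each debate persona shifts hidden states along its own direction.
import numpy as np

rng = np.random.default_rng(0)
d = 64                      # hypothetical hidden-state dimension
n = 500                     # synthetic activations per persona

# Two made-up persona directions in activation space.
dir_a = rng.normal(size=d); dir_a /= np.linalg.norm(dir_a)
dir_b = rng.normal(size=d); dir_b /= np.linalg.norm(dir_b)

# Gaussian "activations" shifted along each persona's direction.
acts_a = rng.normal(size=(n, d)) + 3.0 * dir_a
acts_b = rng.normal(size=(n, d)) + 3.0 * dir_b

X = np.vstack([acts_a, acts_b])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Simplest linear probe: difference-of-means direction,
# thresholded at the midpoint of the two class means.
w = acts_b.mean(axis=0) - acts_a.mean(axis=0)
b = -X.mean(axis=0) @ w

pred = (X @ w + b > 0).astype(float)
acc = (pred == y).mean()
print(f"probe accuracy: {acc:.2f}")
```

If the personas really occupy distinct subspaces, even this crude probe separates them with high accuracy; in a real analysis one would run such probes on activations hooked out of intermediate layers.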

From the abstract

Multi-agent debate has been shown to improve reasoning in large language models (LLMs). However, it is compute-intensive, requiring generation of long transcripts before answering questions. To address this inefficiency, we develop a framework that distills multi-agent debate into a single LLM through a two-stage fine-tuning pipeline combining debate structure learning with internalization via dynamic reward scheduling and length clipping. Across multiple models and benchmarks, our internalized […]