Depriving an AI of specific information makes it mathematically impossible for the model to lie effectively.
April 23, 2026
Original Paper
The Asymmetric Auditor Protocol (AAP): Architectural Bounding of Deceptive Agency in Long-Horizon Autonomous Systems
SSRN · 6526478
The Takeaway
We do not need to teach an AI to be honest if we control what it knows. This asymmetric architecture ensures the model lacks the context needed to construct a successful deception: an independent auditor has access to more information than the agent, making any lie easy to catch. The result is a structural trap that prevents rogue behavior before it starts. This moves AI governance away from psychological alignment and toward architectural design, and safety becomes a game of information theory in which the human always holds the winning cards.
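The asymmetry described above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the names `AgentReport` and `audit` are ours, and we assume the auditor holds a complete action log that the agent never sees in full.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentReport:
    claimed_actions: frozenset  # actions the agent says it took

def audit(report: AgentReport, ground_truth_actions: frozenset) -> bool:
    """Return True if the report matches the auditor's complete log.

    Because the agent never observes `ground_truth_actions` in full, it
    cannot tailor a deceptive report that is guaranteed to pass this check.
    """
    return report.claimed_actions == ground_truth_actions

# The auditor's privileged record of what actually happened.
truth = frozenset({"read_file", "send_email"})

honest = AgentReport(claimed_actions=frozenset({"read_file", "send_email"}))
deceptive = AgentReport(claimed_actions=frozenset({"read_file"}))  # omits an action

print(audit(honest, truth))     # True
print(audit(deceptive, truth))  # False
```

The point of the toy is the information gap: the agent's partial view means any fabricated report is checked against facts it could not condition on.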
From the abstract
Strategic Deception, where an agent conceals misaligned intent, is a critical failure mode for autonomous agency. We propose the Asymmetric Auditor Protocol (AAP), a governance specification that treats agentic risk as a function of State and Permission. We present a three-tier implementation maturity model and a formal algorithm for observer logic, demonstrating how architectural enforcement via a partitioned state-tree and information asymmetry provides a verifiable alternative to behavioral alignment.
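The abstract's "partitioned state-tree" can be illustrated with a minimal sketch, under our own assumptions (not the paper's): state is a path-keyed tree, and each principal holds a set of readable path prefixes, with the auditor's grant set a strict superset of the agent's.

```python
def can_read(grants: set[str], path: str) -> bool:
    """A principal may read a path only under one of its granted prefixes."""
    return any(path == g or path.startswith(g + "/") for g in grants)

# Hypothetical partitioned state-tree: keys are slash-delimited paths.
state_tree = {
    "agent/goals": "summarize inbox",
    "agent/scratch": "draft notes",
    "auditor/ground_truth": "full action log",
}

agent_grants = {"agent"}                # agent: its own partition only
auditor_grants = {"agent", "auditor"}   # auditor: superset, incl. ground truth

print(can_read(agent_grants, "auditor/ground_truth"))  # False: asymmetry enforced
print(can_read(auditor_grants, "agent/scratch"))       # True: auditor sees everything
```

The one-way visibility is the architectural enforcement: the agent cannot even read the partition it would need to inspect in order to craft a lie that survives the audit.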