AI & ML Breaks Assumption

Formalizes the 'Visual Confused Deputy' attack, in which agents are tricked into authorizing privileged actions via subtle visual manipulations of the screen.

arXiv · March 17, 2026 · 2603.14707

Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen

The Takeaway

Identifies a major security gap in computer-using agents: grounding errors can be exploited to bypass user intent. The paper introduces a dual-channel guardrail that independently verifies the visual target and the textual reasoning behind each action, blocking execution when either channel fails.
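The dual-channel idea can be illustrated with a minimal sketch. Everything below is hypothetical scaffolding (the class, field names, and check logic are not from the paper): one channel re-grounds the on-screen element the agent is about to act on, the other checks the agent's stated reasoning against the user's goal, and an action executes only if both agree.

```python
from dataclasses import dataclass

@dataclass
class ProposedAction:
    target_label: str    # what the agent believes it is acting on
    grounded_label: str  # label independently re-read from the screen region
    stated_intent: str   # the agent's textual reasoning for the action
    user_goal: str       # the original user instruction

def visual_check(action: ProposedAction) -> bool:
    # Channel 1: does the independently grounded element match the claimed target?
    return action.target_label.strip().lower() == action.grounded_label.strip().lower()

def textual_check(action: ProposedAction) -> bool:
    # Channel 2 (stub): does the stated reasoning serve the user's goal?
    # A real guardrail would use a verifier model; keyword overlap stands in here.
    goal_terms = set(action.user_goal.lower().split())
    return bool(goal_terms & set(action.stated_intent.lower().split()))

def guardrail(action: ProposedAction) -> bool:
    # Execute only if BOTH independent channels approve the action.
    return visual_check(action) and textual_check(action)

# A visual-confused-deputy scenario: the screen says "Delete all",
# but the agent believes it is clicking "Save". The visual channel blocks it.
attack = ProposedAction("Save", "Delete all", "save the document", "save my report")
benign = ProposedAction("Save", "Save", "save the document", "please save my report")
```

The key design point is independence: the visual channel never trusts the agent's own perception, and the textual channel never trusts the screen, so a single manipulated input cannot satisfy both.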

From the abstract

Computer-using agents (CUAs) act directly on graphical user interfaces, yet their perception of the screen is often unreliable. Existing work largely treats these failures as performance limitations, asking whether an action succeeds, rather than whether the agent is acting on the correct object at all. We argue that this is fundamentally a security problem. We formalize the visual confused deputy: a failure mode in which an agent authorizes an action based on a misperceived screen state, due to …