AI & ML Paradigm Shift

Uses Pearl's do-operator to automatically discover and mask irrelevant state dimensions in Reinforcement Learning.

March 20, 2026

Original Paper

Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning

Jiaxin Liu

arXiv · 2603.18257

The Takeaway

RL agents often fail in complex environments because of "confounded distractors": features that correlate with reward but are not caused by the agent's actions. This method uses causal interventions to identify the agent's actual sphere of influence, allowing RL agents to generalize in environments where distractors outnumber relevant features 3:1.

From the abstract

Selecting relevant state dimensions in the presence of confounded distractors is a causal identification problem: observational statistics alone cannot reliably distinguish dimensions that correlate with actions from those that actions cause. We formalize this as discovering the agent's Causal Sphere of Influence and propose Interventional Boundary Discovery (IBD), which applies Pearl's do-operator to the agent's own actions and uses two-sample testing to produce an interpretable binary mask over
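The core idea in the abstract can be illustrated with a toy sketch (this is my illustration, not the paper's implementation): intervene on the action, i.e. set it directly rather than sampling it from the behavior policy, collect next-state samples under two interventions, and run a per-dimension two-sample test. Dimensions whose distributions shift under do(a) are in the agent's sphere of influence; confounded distractors, which only correlate with actions through a hidden cause, do not shift. The environment, the KS-based test, and the 0.2 threshold below are all hypothetical stand-ins for whatever test and calibration the paper actually uses.

```python
import random

def step(action, rng):
    # Toy dynamics with three state dimensions:
    #   dim 0 is *caused* by the action (controllable),
    #   dim 1 tracks a hidden confounder z that also drives the behavior
    #         policy, so it correlates with actions but is not caused by them,
    #   dim 2 is pure noise.
    z = rng.gauss(0, 1)
    return [action + rng.gauss(0, 0.1),  # controllable dimension
            z + rng.gauss(0, 0.1),       # confounded distractor
            rng.gauss(0, 1)]             # irrelevant noise

def ks_statistic(xs, ys):
    # Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    # between the two empirical CDFs.
    xs, ys = sorted(xs), sorted(ys)
    nx, ny = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < nx and j < ny:
        if xs[i] <= ys[j]:
            i += 1
        else:
            j += 1
        d = max(d, abs(i / nx - j / ny))
    return d

rng = random.Random(0)
n = 2000
# Apply the do-operator: fix the action to 0 or 1, ignoring the policy.
samples = {a: [step(a, rng) for _ in range(n)] for a in (0, 1)}

mask = []
for dim in range(3):
    d = ks_statistic([s[dim] for s in samples[0]],
                     [s[dim] for s in samples[1]])
    mask.append(1 if d > 0.2 else 0)  # crude threshold in lieu of a calibrated test

print(mask)  # only dim 0 responds to the intervention: [1, 0, 0]
```

Under interventions the action-confounder link is severed, so dim 1 looks identical across do(a=0) and do(a=1) even though, observationally, it would predict the action well. That asymmetry is exactly what observational statistics alone cannot detect.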