AI & ML Paradigm Shift

Edit-As-Act reframes 3D scene editing as a goal-regressive planning problem using symbolic action languages rather than purely generative pixel manipulation.

March 19, 2026

Original Paper

Edit-As-Act: Goal-Regressive Planning for Open-Vocabulary 3D Indoor Scene Editing

Seongrae Noh, SeungWon Seo, Gyeong-Moon Park, HyeongYeop Kang

arXiv · 2603.17583

The Takeaway

By treating an edit as a sequence of PDDL-inspired actions that reason over geometric relations such as support, contact, and collision, Edit-As-Act aims to keep edited scenes physically plausible and globally consistent. This moves 3D editing away from 'black box' generation toward interpretable, physically grounded transformations.
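To make "goal-regressive planning" concrete, here is a minimal sketch of classical STRIPS-style goal regression in Python. It is not the paper's action language or planner: the `Action` class, the `regress` and `plan_backward` functions, and the string predicates such as `on(lamp,table)` are hypothetical names chosen for illustration. The sketch starts from the desired state, works backward through actions whose effects satisfy part of the goal, and stops once the remaining subgoal already holds in the current scene.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Toy symbolic action (hypothetical, not the paper's language)."""
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def regress(goal, action):
    """Return the subgoal that must hold before `action` so that `goal`
    holds after it, or None if the action is irrelevant or destroys
    part of the goal."""
    if not (action.add & goal) or (action.delete & goal):
        return None
    return (goal - action.add) | action.pre


def plan_backward(state, goal, actions, max_depth=6):
    """Breadth-first search backward from the goal. Returns a list of
    action names in execution order, or None if no plan is found."""
    start = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        subgoal, plan = queue.popleft()
        if subgoal <= state:          # subgoal already true in the scene
            return plan
        if len(plan) >= max_depth:
            continue
        for action in actions:
            prev = regress(subgoal, action)
            if prev is not None and prev not in seen:
                seen.add(prev)
                queue.append((prev, [action.name] + plan))
    return None


# Assumed toy scene: a lamp on the floor, an empty table, an unrelated book.
state = frozenset({"on(lamp,floor)", "clear(table)", "on(book,desk)"})
actions = [
    Action("lift(lamp)",
           pre=frozenset({"on(lamp,floor)"}),
           add=frozenset({"held(lamp)"}),
           delete=frozenset({"on(lamp,floor)"})),
    Action("place(lamp,table)",
           pre=frozenset({"held(lamp)", "clear(table)"}),
           add=frozenset({"on(lamp,table)"}),
           delete=frozenset({"held(lamp)", "clear(table)"})),
]

plan = plan_backward(state, {"on(lamp,table)"}, actions)
print(plan)
```

Because the search only ever considers actions that contribute to the goal, facts unrelated to the instruction (here `on(book,desk)`) are never touched, which is the intuition behind avoiding unintended global changes. A real system would additionally need geometric checks behind each predicate; this sketch treats them as opaque symbols.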

From the abstract

Editing a 3D indoor scene from natural language is conceptually straightforward but technically challenging. Existing open-vocabulary systems often regenerate large portions of a scene or rely on image-space edits that disrupt spatial structure, resulting in unintended global changes or physically inconsistent layouts. These limitations stem from treating editing primarily as a generative task. We take a different view. A user instruction defines a desired world state, and editing should be the