Demonstrates that visual hierarchies require Lorentzian causal structure rather than Euclidean space.
March 27, 2026
Original Paper
Light Cones For Vision: Simple Causal Priors For Visual Hierarchy
arXiv · 2603.24753
The Takeaway
Switching to Lorentzian light cones yielded a 6x accuracy improvement in object discovery with only 11K parameters, suggesting the field has been using the wrong geometric inductive bias for hierarchical representation.
From the abstract
Standard vision models treat objects as independent points in Euclidean space, unable to capture hierarchical structure like parts within wholes. We introduce Worldline Slot Attention, which models objects as persistent trajectories through spacetime worldlines, where each object has multiple slots at different hierarchy levels sharing the same spatial position but differing in temporal coordinates. This architecture consistently fails without geometric structure: Euclidean worldlines achieve 0.
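To make the geometric idea concrete, here is a minimal sketch of a light-cone test in 2+1 dimensional Minkowski space. It is not the paper's implementation: the function names, the (-, +, +) signature, the unit speed of light, and the choice to place the "whole" earlier in time than its "parts" are all assumptions made for illustration.

```python
# Illustrative sketch only; names and conventions are hypothetical,
# not taken from the paper.

def minkowski_interval(a, b):
    """Squared interval between events a and b, each given as (t, x, y).

    Uses signature (-, +, +) with c = 1 (an assumed convention).
    Negative means timelike, zero lightlike, positive spacelike.
    """
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    dy = b[2] - a[2]
    return -dt * dt + dx * dx + dy * dy


def in_future_light_cone(parent, child):
    """True if child lies inside or on the future light cone of parent."""
    dt = child[0] - parent[0]
    return dt > 0 and minkowski_interval(parent, child) <= 0


def worldline(x, y, times):
    """Slots for one object: same spatial position, different time coordinates."""
    return [(t, x, y) for t in times]


# One object with three hierarchy levels at a shared position.
slots = worldline(0.0, 0.0, [0.0, 1.0, 2.0])

# Every later slot on a worldline is causally reachable from every earlier one.
chain_ok = all(
    in_future_light_cone(slots[i], slots[j])
    for i in range(len(slots))
    for j in range(i + 1, len(slots))
)

whole = (0.0, 0.0, 0.0)
near_part = (1.0, 0.5, 0.0)  # spatial offset 0.5 < time gap 1.0: inside the cone
far_part = (1.0, 2.0, 0.0)   # spatial offset 2.0 > time gap 1.0: outside the cone

print(chain_ok)
print(in_future_light_cone(whole, near_part))
print(in_future_light_cone(whole, far_part))
```

Unlike Euclidean distance, the interval is asymmetric in time, which is what lets it encode a direction (such as part-of) between two points rather than only how far apart they are.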