AI & ML New Capability

Cortical Policy introduces a dual-stream view transformer, inspired by the dorsal and ventral pathways of the human brain, for complex robotic manipulation tasks.

March 24, 2026

Original Paper

Cortical Policy: A Dual-Stream View Transformer for Robotic Manipulation

Xuening Zhang, Qi Lv, Xiang Deng, Miao Zhang, Xingbo Liu, Liqiang Nie

arXiv · 2603.21051

The Takeaway

By integrating static 3D foundation model features with dynamic egocentric gaze estimation, this architecture significantly outperforms previous state-of-the-art methods on the COLOSSEUM benchmark. It offers a new template for multi-view robotic perception that handles spatial complexity and dynamic scene changes at the same time.
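
To make the dual-stream idea concrete, here is a toy sketch in plain Python. It is not the paper's architecture: the fusion rule (per-view concatenation followed by attention pooling across views), the random features, the 7-dimensional action output, and every function name are assumptions made only for illustration. It shows one simple way per-view "static" and "dynamic" features could be merged into a single action prediction.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse_views(static_feats, dynamic_feats, query):
    """Concatenate each view's static and dynamic features into one token,
    then attention-pool the tokens across views using a single query vector.
    (Hypothetical fusion rule, chosen for simplicity.)"""
    tokens = [s + d for s, d in zip(static_feats, dynamic_feats)]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    dim = len(tokens[0])
    pooled = [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]
    return pooled, weights


def predict_action(pooled, head):
    """Linear head mapping the pooled feature to an action vector."""
    return [dot(row, pooled) for row in head]


def demo(num_views=3, static_dim=4, dynamic_dim=2, action_dim=7, seed=0):
    """Run the toy pipeline on random stand-in features.
    All sizes here are illustrative assumptions, not values from the paper."""
    rng = random.Random(seed)

    def rand_vec(n):
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]

    static_feats = [rand_vec(static_dim) for _ in range(num_views)]
    dynamic_feats = [rand_vec(dynamic_dim) for _ in range(num_views)]
    token_dim = static_dim + dynamic_dim
    query = rand_vec(token_dim)
    head = [rand_vec(token_dim) for _ in range(action_dim)]

    pooled, weights = fuse_views(static_feats, dynamic_feats, query)
    return predict_action(pooled, head), weights
```

Calling `demo()` returns an action vector and the per-view attention weights, which sum to 1. In a real system the random vectors would be replaced by learned encoders for each stream, and the pooling step by transformer layers.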

From the abstract

View transformers process multi-view observations to predict actions and have shown impressive performance in robotic manipulation. Existing methods typically extract static visual representations in a view-specific manner, leading to inadequate 3D spatial reasoning ability and a lack of dynamic adaptation. Taking inspiration from how the human brain integrates static and dynamic views to address these challenges, we propose Cortical Policy, a novel dual-stream view transformer for robotic manip