AI & ML Open Release

An open foundation model for humanoid robots that achieves high performance using only 30 hours of real-world robot data by pre-training on egocentric human videos.

arXiv · March 13, 2026 · 2603.12263

Songlin Wei, Hongyi Jing, Boqian Li, Zhenyu Zhao, Jiageng Mao, Zhenhao Ni, Sicheng He, Jie Liu, Xiawei Liu, Kaidi Kang, Sheng Zang, Weiduo Yuan, Marco Pavone, Di Huang, Yue Wang

Why it matters

It addresses the data scarcity in robotics by decoupling visual-action representation learning (from human video) from precise joint control (from robot data). The release of an open foundation model specifically for humanoid loco-manipulation lowers the barrier for researchers in embodied AI.

From the abstract

We introduce $\Psi_0$ (Psi-Zero), an open foundation model to address challenging humanoid loco-manipulation tasks. While existing approaches often attempt to address this fundamental problem by co-training on large and diverse human and humanoid data, we argue that this strategy is suboptimal due to the fundamental kinematic and motion disparities between humans and humanoid robots. Therefore, data efficiency and model performance remain unsatisfactory despite the considerable data volume. To a