AI & ML Scaling Insight

Demonstrates that massive scaling of diverse simulator resets can replace manual curriculum engineering for complex dexterous manipulation.

arXiv · March 18, 2026 · 2603.15789

Patrick Yin, Tyler Westenbroek, Zhengyu Zhang, Joshua Tran, Ignacio Dagnino, Eeshani Shilamkar, Numfor Mbiziwo-Tiapo, Simran Bagaria, Xinlei Liu, Galen Mullins, Andrey Kolobov, Abhishek Gupta

The Takeaway

The paper shows that long-horizon, contact-rich robot tasks can be solved without human demonstrations or complex reward shaping. The result shifts the focus from task-specific engineering to programmatic data coverage in simulation.

From the abstract

Reinforcement learning in massively parallel physics simulations has driven major progress in sim-to-real robot learning. However, current approaches remain brittle and task-specific, relying on extensive per-task engineering to design rewards, curricula, and demonstrations. Even with this engineering, they often fail on long-horizon, contact-rich manipulation tasks and do not meaningfully scale with compute, as performance quickly saturates when training revisits the same narrow regions of stat