RoboClaw introduces 'Entangled Action Pairs' to allow robots to autonomously collect data by learning to reset their own environment.
arXiv · March 13, 2026 · 2603.11558
Why it matters
A major bottleneck in robot learning is the human intervention needed to reset the environment between trials. By coupling each forward manipulation action with an inverse recovery action, RoboClaw enables continuous on-policy data acquisition, reducing human effort by 53% and improving long-horizon task success.
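To make the forward/inverse coupling concrete, here is a minimal Python sketch of a self-resetting collection loop. It is not RoboClaw's implementation: `ToyEnv`, `ActionPair`, `collect`, and the action names are hypothetical stand-ins, and the one-object world is an assumption chosen only to show how an inverse action can take the place of a manual reset.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical one-object world: the object is on the 'table' or in the 'bin'."""
    location: str = "table"

    def apply(self, action: str) -> bool:
        # Each action succeeds only from its expected start state.
        if action == "place_in_bin" and self.location == "table":
            self.location = "bin"
            return True
        if action == "return_to_table" and self.location == "bin":
            self.location = "table"
            return True
        return False


@dataclass
class ActionPair:
    """A forward action paired with the inverse action that undoes it (illustrative)."""
    forward: str
    inverse: str


def collect(env: ToyEnv, pair: ActionPair, cycles: int):
    """Alternate forward and inverse actions, logging every attempt as data."""
    log = []
    manual_resets = 0
    for _ in range(cycles):
        log.append((pair.forward, env.apply(pair.forward)))
        inverse_ok = env.apply(pair.inverse)
        log.append((pair.inverse, inverse_ok))
        if not inverse_ok:
            # Stand-in for a human putting the object back by hand.
            env.location = "table"
            manual_resets += 1
    return log, manual_resets


log, manual_resets = collect(ToyEnv(), ActionPair("place_in_bin", "return_to_table"), cycles=3)
print(len(log), manual_resets)
```

In this toy setting every inverse action succeeds, so the loop runs without any manual reset. On a real robot the inverse can fail, and only those failures would require a person to step in.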
From the abstract
Vision-Language-Action (VLA) systems have shown strong potential for language-driven robotic manipulation. However, scaling them to long-horizon tasks remains challenging. Existing pipelines typically separate data collection, policy learning, and deployment, resulting in heavy reliance on manual environment resets and brittle multi-policy execution. We present RoboClaw, an agentic robotics framework that unifies data collection, policy learning, and task execution under a single VLM-driven cont