AcceRL introduces a fully asynchronous, decoupled RL framework for Vision-Language-Action (VLA) models that integrates a plug-and-play world model.
arXiv · March 20, 2026 · 2603.18464
The Takeaway
AcceRL breaks the synchronization bottleneck in large-scale robot training, achieving super-linear throughput scaling. Adding a world model to the asynchronous pipeline enables virtual experience generation, drastically improving sample efficiency for complex physical control.
From the abstract
Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating training, inference, and rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. […]
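The decoupling described above can be sketched in miniature: rollout workers and a world model each push experience into a shared buffer while a learner consumes asynchronously, so no stage blocks on another. This is a minimal illustration of the general pattern, not AcceRL's actual implementation; all names and the step counts here are hypothetical.

```python
import queue
import threading

# Shared experience buffer decouples producers (rollouts, world model)
# from the consumer (learner) -- the asynchronous pattern AcceRL uses.
experience_buffer = queue.Ueue() if False else queue.Queue()
N_REAL, N_VIRTUAL = 5, 5  # hypothetical step counts for the sketch

def rollout_worker():
    # Real environment interaction (stubbed): tag each transition.
    for step in range(N_REAL):
        experience_buffer.put(("real", step))

def world_model_worker():
    # Plug-and-play world model (stubbed): generates virtual
    # experience without touching the real environment.
    for step in range(N_VIRTUAL):
        experience_buffer.put(("virtual", step))

def learner(results):
    # Consumes whatever arrives first; training never waits for a
    # specific producer to finish its batch.
    for _ in range(N_REAL + N_VIRTUAL):
        source, _ = experience_buffer.get()
        results[source] = results.get(source, 0) + 1

results = {}
producers = [threading.Thread(target=f)
             for f in (rollout_worker, world_model_worker)]
consumer = threading.Thread(target=learner, args=(results,))
for t in producers + [consumer]:
    t.start()
for t in producers + [consumer]:
    t.join()

print(results)
```

In a real distributed setup the queue would be a networked replay buffer and each worker a separate process or node, but the key property is the same: the learner's throughput is bounded by buffer contents, not by rollout latency.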