AcceRL introduces a fully asynchronous, decoupled RL framework for Vision-Language-Action (VLA) models that integrates a plug-and-play world model.
arXiv · March 20, 2026 · 2603.18464
The Takeaway
AcceRL breaks the synchronization bottleneck in large-scale robot training, achieving super-linear throughput scaling. Adding a world model to the asynchronous pipeline enables virtual experience generation, drastically improving sample efficiency for complex physical control.
From the abstract
Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating training, inference, and rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. […]
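The decoupling described above can be sketched in miniature: rollout workers and a world model each push experience into a shared buffer while a learner consumes asynchronously, so no stage blocks on another. This is a minimal illustration of the general pattern, not AcceRL's actual implementation; all names and the step counts here are hypothetical.

```python
import queue
import threading

# Shared experience buffer decouples producers (rollouts, world model)
# from the consumer (learner) -- the asynchronous pattern AcceRL uses.
experience_buffer = queue.Ueue() if False else queue.Queue()
N_REAL, N_VIRTUAL = 5, 5  # hypothetical step counts for the sketch

def rollout_worker():
    # Real environment interaction (stubbed): tag each transition.
    for step in range(N_REAL):
        experience_buffer.put(("real", step))

def world_model_worker():
    # Plug-and-play world model (stubbed): generates virtual
    # experience without touching the real environment.
    for step in range(N_VIRTUAL):
        experience_buffer.put(("virtual", step))

def learner(results):
    # Consumes whatever arrives first; training never waits for a
    # specific producer to finish its batch.
    for _ in range(N_REAL + N_VIRTUAL):
        source, _ = experience_buffer.get()
        results[source] = results.get(source, 0) + 1

results = {}
producers = [threading.Thread(target=f)
             for f in (rollout_worker, world_model_worker)]
consumer = threading.Thread(target=learner, args=(results,))
for t in producers + [consumer]:
    t.start()
for t in producers + [consumer]:
    t.join()

print(results)
```

In a real distributed setup the queue would be a networked replay buffer and each worker a separate process or node, but the key property is the same: the learner's throughput is bounded by buffer contents, not by rollout latency.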