| Main Authors: | Lu, Chengxuan; Wang, Shukuan; Li, Yanjie; Liu, Wei; Jin, Shiji; Qian, Fuyuan; Li, Peiming; Sun, Baigui; Liu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2603.18464 |
| _version_ | 1866908901740380160 |
|---|---|
| author | Lu, Chengxuan; Wang, Shukuan; Li, Yanjie; Liu, Wei; Jin, Shiji; Qian, Fuyuan; Li, Peiming; Sun, Baigui; Liu, Yang |
| author_facet | Lu, Chengxuan; Wang, Shukuan; Li, Yanjie; Liu, Wei; Jin, Shiji; Qian, Fuyuan; Li, Peiming; Sun, Baigui; Liu, Yang |
| contents | Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating training, inference, and rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. Experiments on the LIBERO benchmark (Liu et al., 2023) demonstrate that AcceRL achieves state-of-the-art (SOTA) performance. At the system level, it exhibits super-linear scaling in throughput and highly efficient hardware utilization. At the algorithm level, the world-model-augmented variant delivers unprecedented sample efficiency and robust training stability in complex control tasks. Code is publicly available at https://github.com/distanceLu/AcceRL. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_18464 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2603.18464 |
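
The abstract describes decoupling training, inference, and rollouts so that none of them waits on a synchronization barrier. The following is a minimal illustrative sketch of that general pattern only, not AcceRL's implementation: threads and in-memory queues stand in for physically isolated processes, the environment and the one-parameter "policy" are toys, the world model is not modeled, and every name here (`PolicyStore`, `inference_server`, `rollout_worker`, `trainer`, `run`) is hypothetical.

```python
import queue
import threading


class PolicyStore:
    """Latest policy weight; the trainer writes it, the inference server reads it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._weight = 0.0
        self._version = 0

    def publish(self, weight):
        with self._lock:
            self._weight = weight
            self._version += 1

    def snapshot(self):
        with self._lock:
            return self._weight, self._version


def inference_server(requests, store, stop):
    """Answers action requests with whichever weights are current."""
    while not stop.is_set():
        try:
            obs, reply = requests.get(timeout=0.05)
        except queue.Empty:
            continue
        weight, version = store.snapshot()
        reply.put((weight * obs, version))


def rollout_worker(worker_id, steps, requests, experience):
    """Steps a toy environment; it never waits on the trainer."""
    reply = queue.Queue(maxsize=1)
    obs = float(worker_id + 1)  # toy constant observation per worker
    for _ in range(steps):
        requests.put((obs, reply))
        action, version = reply.get()
        reward = -abs(obs - action)  # toy reward: the action should match obs
        experience.put((obs, action, reward, version))


def trainer(experience, store, total, lr=0.1):
    """Consumes transitions as they arrive and publishes updated weights."""
    consumed = 0
    while consumed < total:
        obs, _action, _reward, _version = experience.get()
        weight, _ = store.snapshot()
        # Toy update: one gradient step on 0.5 * (weight * obs - obs) ** 2.
        weight -= lr * (weight * obs - obs) * obs
        store.publish(weight)
        consumed += 1
    return consumed


def run(num_workers=2, steps=20):
    store = PolicyStore()
    requests, experience = queue.Queue(), queue.Queue()
    stop = threading.Event()
    server = threading.Thread(target=inference_server, args=(requests, store, stop))
    workers = [
        threading.Thread(target=rollout_worker, args=(i, steps, requests, experience))
        for i in range(num_workers)
    ]
    server.start()
    for w in workers:
        w.start()
    consumed = trainer(experience, store, total=num_workers * steps)
    for w in workers:
        w.join()
    stop.set()
    server.join()
    return consumed, store.snapshot()


if __name__ == "__main__":
    consumed, (weight, version) = run()
    print(consumed, version, round(weight, 3))
```

Each transition carries the policy version that produced its action, so a learner in this style could measure or correct for staleness; the toy trainer above ignores that field.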