Staff View: Beyond Imitation: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models :: Library Catalog

Bibliographic Details
Main Authors:	Shi, Liangzhi; Chen, Shuaihang; Gao, Feng; Chen, Yinuo; Chen, Kang; Zhang, Tonghe; Zang, Hongzhi; Zhang, Weinan; Yu, Chao; Wang, Yu
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2602.12628

_version_	1866910043251671040
author	Shi, Liangzhi; Chen, Shuaihang; Gao, Feng; Chen, Yinuo; Chen, Kang; Zhang, Tonghe; Zang, Hongzhi; Zhang, Weinan; Yu, Chao; Wang, Yu
author_facet	Shi, Liangzhi; Chen, Shuaihang; Gao, Feng; Chen, Yinuo; Chen, Kang; Zhang, Tonghe; Zang, Hongzhi; Zhang, Weinan; Yu, Chao; Wang, Yu
contents	Simulation offers a scalable and low-cost way to enrich vision-language-action (VLA) training, reducing reliance on expensive real-robot demonstrations. However, most sim-real co-training methods rely on supervised fine-tuning (SFT), which treats simulation as a static source of demonstrations and does not exploit large-scale closed-loop interaction. Consequently, real-world gains and generalization are often limited. In this paper, we propose an RL-based sim-real Co-training (RL-Co) framework that leverages interactive simulation while preserving real-world capabilities. Our method follows a generic two-stage design: we first warm-start the policy with SFT on a mixture of real and simulated demonstrations, then fine-tune it with reinforcement learning in simulation while adding an auxiliary supervised loss on real-world data to anchor the policy and mitigate catastrophic forgetting. We evaluate our framework on four real-world tabletop manipulation tasks using two representative VLA architectures, OpenVLA and $π_{0.5}$, and observe consistent improvements over real-only fine-tuning and SFT-based co-training, including +24% real-world success on OpenVLA and +20% on $π_{0.5}$. Beyond higher success rates, RL co-training yields stronger generalization to unseen task variations and substantially improved real-world data efficiency, providing a practical and scalable pathway for leveraging simulation to enhance real-robot deployment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_12628
institution	arXiv
publishDate	2026
record_format	arxiv
title	Beyond Imitation: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models
topic	Robotics
url	https://arxiv.org/abs/2602.12628

Similar Items