Bibliographic Details
Main Authors:	Yin, Tenny; Mei, Zhiting; Zheng, Zhonghe; Yamane, Miyu; Wang, David; Sceats, Jade; Bateman, Samuel M.; Zha, Lihan; Badithela, Apurva; Shorinwa, Ola; Majumdar, Anirudha
Format:	Preprint
Published:	2026
Subjects:	Robotics; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.09030
Table of Contents:

Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current state-of-the-art video models still struggle to predict physically consistent robot-object interactions that are crucial in robotic manipulation. To close this gap, we present PlayWorld, a simple, scalable, and fully autonomous pipeline for training high-fidelity video world simulators from interaction experience. In contrast to prior approaches that rely on success-biased human demonstrations, PlayWorld is the first system capable of learning entirely from unsupervised robot self-play, enabling naturally scalable data collection while capturing complex, long-tailed physical interactions essential for modeling realistic object dynamics. Experiments across diverse manipulation tasks show that PlayWorld generates high-quality, physically consistent predictions for contact-rich interactions that are not captured by world models trained on human-collected data. We further demonstrate the versatility of PlayWorld in enabling fine-grained failure prediction and policy evaluation, with up to 40% improvements over human-collected data. Finally, we demonstrate how PlayWorld enables reinforcement learning in the world model, improving policy performance by 65% in success rates when deployed in the real world.

Similar Items