| Main Authors: | Ye, Angen, Wang, Boyuan, Ni, Chaojun, Huang, Guan, Zhao, Guosheng, Li, Hao, Li, Hengtao, Li, Jie, Lv, Jindi, Liu, Jingyu, Cao, Min, Li, Peng, Deng, Qiuping, Mei, Wenjun, Wang, Xiaofeng, Chen, Xinze, Zhou, Xinyu, Wang, Yang, Chang, Yifan, Li, Yifan, Zhou, Yukun, Ye, Yun, Liu, Zhichao, Zhu, Zheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.17240 |
Similar Items
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
by: GigaWorld Team, et al.
Published: (2025)
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
by: GigaBrain Team, et al.
Published: (2025)
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
by: GigaBrain Team, et al.
Published: (2026)
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video
by: Wang, Boyuan, et al.
Published: (2026)
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
by: Ye, Angen, et al.
Published: (2025)
EmbodieDreamer: Advancing Real2Sim2Real Transfer for Policy Training via Embodied World Modeling
by: Wang, Boyuan, et al.
Published: (2025)
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
by: Zhao, Guosheng, et al.
Published: (2024)
WonderTurbo: Generating Interactive 3D World in 0.72 Seconds
by: Ni, Chaojun, et al.
Published: (2025)
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning
by: Lv, Jindi, et al.
Published: (2026)
EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer
by: Dong, Zhehao, et al.
Published: (2025)
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
by: Wang, Boyuan, et al.
Published: (2025)
WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration
by: Ni, Chaojun, et al.
Published: (2025)
VAG: Dual-Stream Video-Action Generation for Embodied Data Synthesis
by: Lang, Xiaolei, et al.
Published: (2026)
GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning
by: Bao, Xiaoyi, et al.
Published: (2025)
ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models
by: Ye, Wencheng, et al.
Published: (2025)
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
by: Wang, Xiaofeng, et al.
Published: (2024)
ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction
by: Ni, Chaojun, et al.
Published: (2025)
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead
by: Ni, Chaojun, et al.
Published: (2025)
Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation
by: Niu, Ye, et al.
Published: (2025)
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
by: Zhao, Guosheng, et al.
Published: (2024)
WorldTree: Towards 4D Dynamic Worlds from Monocular Video using Tree-Chains
by: Wang, Qisen, et al.
Published: (2026)
Can Structured Templates Facilitate LLMs in Tackling Harder Tasks? : An Exploration of Scaling Laws by Difficulty
by: Yang, Zhichao, et al.
Published: (2025)
DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning
by: Zhou, Yang, et al.
Published: (2026)
From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
by: Zhao, Zhida, et al.
Published: (2025)
IC-World: In-Context Generation for Shared World Modeling
by: Wu, Fan, et al.
Published: (2025)
RotVLA: Rotational Latent Action for Vision-Language-Action Model
by: Li, Qiwei, et al.
Published: (2026)
VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
by: Li, Hengtao, et al.
Published: (2025)
World Models as Group Actions
by: Wang, Zijie, et al.
Published: (2026)
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
by: Li, Yuhan, et al.
Published: (2025)
Co-Evolving Latent Action World Models
by: Wang, Yucen, et al.
Published: (2025)
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
by: Ni, Chaojun, et al.
Published: (2024)
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
by: Wang, Boyuan, et al.
Published: (2025)
MimicDreamer: Aligning Human and Robot Demonstrations for Scalable VLA Training
by: Li, Haoyun, et al.
Published: (2025)
DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion
by: Wang, Weijie, et al.
Published: (2025)
MAPF-World: Action World Model for Multi-Agent Path Finding
by: Yang, Zhanjiang, et al.
Published: (2025)
OA-WAM: Object-Addressable World Action Model for Robust Robot Manipulation
by: Liu, Yushan, et al.
Published: (2026)
ForgeVLA: Federated Vision-Language-Action Learning without Language Annotations
by: Zhou, Yuhao, et al.
Published: (2026)
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
by: Zhao, Guosheng, et al.
Published: (2025)
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
by: Guo, Jun, et al.
Published: (2026)
The DAWN of World-Action Interactive Models
by: Lu, Hongbo, et al.
Published: (2026)