| Main Authors: | Wang, Meizhong, Jin, Wanxin, Cao, Kun, Xie, Lihua, Hong, Yiguang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.11021 |
Similar Items
EA-WM: Event-Aware Generative World Model with Structured Kinematic-to-Visual Action Fields
by: Yang, Zhaoyang, et al.
Published: (2026)
Driver-WM: A Driver-Centric Traffic-Conditioned Latent World Model for In-Cabin Dynamics Rollout
by: Chi, Haozhuang, et al.
Published: (2026)
PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis
by: Yang, Yu, et al.
Published: (2025)
Robot Learning from a Physical World Model
by: Mao, Jiageng, et al.
Published: (2025)
World Models for Learning Dexterous Hand-Object Interactions from Human Videos
by: Goswami, Raktim Gautam, et al.
Published: (2025)
World Simulation with Video Foundation Models for Physical AI
by: NVIDIA, et al.
Published: (2025)
Rethinking Video Generation Model for the Embodied World
by: Deng, Yufan, et al.
Published: (2026)
Physically Grounded Vision-Language Models for Robotic Manipulation
by: Gao, Jensen, et al.
Published: (2023)
H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos
by: Ci, Hai, et al.
Published: (2025)
DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning
by: Zhou, Yang, et al.
Published: (2026)
Ego-Grounding for Personalized Question-Answering in Egocentric Videos
by: Xiao, Junbin, et al.
Published: (2026)
UNIC: Learning Unified Multimodal Extrinsic Contact Estimation
by: Xu, Zhengtong, et al.
Published: (2026)
ICAT: Incident-Case-Grounded Adaptive Testing for Physical-Risk Prediction in Embodied World Models
by: Lai, Zhenglin, et al.
Published: (2026)
ContactHandover: Contact-Guided Robot-to-Human Object Handover
by: Wang, Zixi, et al.
Published: (2024)
Mirage2Matter: A Physically Grounded Gaussian World Model from Video
by: Gao, Zhengqing, et al.
Published: (2026)
One-Shot Manipulation Strategy Learning by Making Contact Analogies
by: Liu, Yuyao, et al.
Published: (2024)
DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving
by: Zhou, Yang, et al.
Published: (2026)
World Models That Know When They Don't Know - Controllable Video Generation with Calibrated Uncertainty
by: Mei, Zhiting, et al.
Published: (2025)
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
by: Guo, Jun, et al.
Published: (2026)
Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
by: Fu, Ao, et al.
Published: (2024)
Digital Gene: Learning about the Physical World through Analytic Concepts
by: Sun, Jianhua, et al.
Published: (2025)
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
by: Xu, Xinli, et al.
Published: (2024)
DDP-WM: Disentangled Dynamics Prediction for Efficient World Models
by: Yin, Shicheng, et al.
Published: (2026)
Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM
by: Pham, Phu, et al.
Published: (2024)
Grounding Video Models to Actions through Goal Conditioned Exploration
by: Luo, Yunhao, et al.
Published: (2024)
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
by: Lu, Guanxing, et al.
Published: (2025)
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals
by: Gillman, Nate, et al.
Published: (2026)
Continuous Vision-Language-Action Co-Learning with Semantic-Physical Alignment for Behavioral Cloning
by: Qi, Xiuxiu, et al.
Published: (2025)
GrndCtrl: Grounding World Models via Self-Supervised Reward Alignment
by: He, Haoyang, et al.
Published: (2025)
Chain of World: World Model Thinking in Latent Motion
by: Yang, Fuxiang, et al.
Published: (2026)
Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households
by: Cao, Zhihao, et al.
Published: (2024)
Multi-Modal World Model for Physical Robot Interactions: Simultaneous Visual and Tactile Predictions for Enhanced Accuracy
by: Mandil, Willow, et al.
Published: (2023)
PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2026)
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence
by: Zeng, Tianle, et al.
Published: (2026)
Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion
by: Lu, Haoran, et al.
Published: (2026)
WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models
by: Chen, Hongjin, et al.
Published: (2026)
Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow
by: Dharmarajan, Karthik, et al.
Published: (2025)
AdaWorld: Learning Adaptable World Models with Latent Actions
by: Gao, Shenyuan, et al.
Published: (2025)
MMAUD: A Comprehensive Multi-Modal Anti-UAV Dataset for Modern Miniature Drone Threats
by: Yuan, Shenghai, et al.
Published: (2024)
Learning 3D-Gaussian Simulators from RGB Videos
by: Zhobro, Mikel, et al.
Published: (2025)