| Main Authors: | Wang, Qisen, Zhao, Yifan, Li, Jia |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.11845 |
Similar Items
ChronosObserver: Taming 4D World with Hyperspace Diffusion Sampling
by: Wang, Qisen, et al.
Published: (2025)
GFlow: Recovering 4D World from Monocular Video
by: Wang, Shizun, et al.
Published: (2024)
How to Use Diffusion Priors under Sparse Views?
by: Wang, Qisen, et al.
Published: (2024)
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
by: Xu, Zhen, et al.
Published: (2025)
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
by: Yang, Yuxue, et al.
Published: (2026)
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
by: Xing, Ke, et al.
Published: (2025)
Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos
by: Peddi, Rohith, et al.
Published: (2026)
TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
by: Chen, Yabo, et al.
Published: (2025)
TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
by: Lu, Jiahao, et al.
Published: (2025)
MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration
by: Li, Guangyuan, et al.
Published: (2025)
DeepVerse: 4D Autoregressive Video Generation as a World Model
by: Chen, Junyi, et al.
Published: (2025)
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
by: Zhang, Yuanhan, et al.
Published: (2024)
Towards Visual Query Localization in the 3D World
by: Peng, Liang, et al.
Published: (2026)
WorldSimBench: Towards Video Generation Models as World Simulators
by: Qin, Yiran, et al.
Published: (2024)
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward
by: An, Zhaochong, et al.
Published: (2026)
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
by: Zheng, Sixiao, et al.
Published: (2026)
LivingWorld: Interactive 4D World Generation with Environmental Dynamics
by: Mun, Hyeongju, et al.
Published: (2026)
WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation
by: Wang, Gongshu, et al.
Published: (2025)
DSG-World: Learning a 3D Gaussian World Model from Dual State Videos
by: Hu, Wenhao, et al.
Published: (2025)
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
by: Huang, Jiaxin, et al.
Published: (2025)
MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video
by: Wang, Hengyi, et al.
Published: (2023)
AR4D: Autoregressive 4D Generation from Monocular Videos
by: Zhu, Hanxin, et al.
Published: (2025)
GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation
by: Yang, Zhenya, et al.
Published: (2025)
Gaussian Sequences with Multi-Scale Dynamics for 4D Reconstruction from Monocular Casual Videos
by: Li, Can, et al.
Published: (2026)
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
by: Zhou, Yang, et al.
Published: (2025)
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
by: Wang, Xiaofeng, et al.
Published: (2024)
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
by: Chu, Wen-Hsuan, et al.
Published: (2024)
Predicting 4D Hand Trajectory from Monocular Videos
by: Ye, Yufei, et al.
Published: (2025)
EA3D: Online Open-World 3D Object Extraction from Streaming Videos
by: Zhou, Xiaoyu, et al.
Published: (2025)
IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video
by: Khanbayov, Rasul, et al.
Published: (2026)
WorldMirror: Universal 3D World Reconstruction with Any-Prior Prompting
by: Liu, Yifan, et al.
Published: (2025)
Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video
by: Jiang, Zeren, et al.
Published: (2026)
Realistic Surgical Simulation from Monocular Videos
by: Wang, Kailing, et al.
Published: (2024)
HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
by: Xiong, Weitao, et al.
Published: (2025)
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
by: Wang, Weijie, et al.
Published: (2026)
LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
by: Duan, Zicheng, et al.
Published: (2026)
StableWorld: Towards Stable and Consistent Long Interactive Video Generation
by: Yang, Ying, et al.
Published: (2026)
LongDPM: Overlap-Aware 4D Reconstruction from Long Monocular Videos
by: Xu, Chenyi, et al.
Published: (2026)
Diffusion Priors for Dynamic View Synthesis from Monocular Videos
by: Wang, Chaoyang, et al.
Published: (2024)
Towards 3D Objectness Learning in an Open World
by: Liu, Taichi, et al.
Published: (2025)