:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Qisen; Zhao, Yifan; Li, Jia
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.11845

Similar Items

ChronosObserver: Taming 4D World with Hyperspace Diffusion Sampling
by: Wang, Qisen, et al.
Published: (2025)

GFlow: Recovering 4D World from Monocular Video
by: Wang, Shizun, et al.
Published: (2024)

How to Use Diffusion Priors under Sparse Views?
by: Wang, Qisen, et al.
Published: (2024)

4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
by: Xu, Zhen, et al.
Published: (2025)

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
by: Yang, Yuxue, et al.
Published: (2026)

StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
by: Xing, Ke, et al.
Published: (2025)

Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos
by: Peddi, Rohith, et al.
Published: (2026)

TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
by: Chen, Yabo, et al.
Published: (2025)

TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
by: Lu, Jiahao, et al.
Published: (2025)

MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration
by: Li, Guangyuan, et al.
Published: (2025)

DeepVerse: 4D Autoregressive Video Generation as a World Model
by: Chen, Junyi, et al.
Published: (2025)

WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
by: Zhang, Yuanhan, et al.
Published: (2024)

Towards Visual Query Localization in the 3D World
by: Peng, Liang, et al.
Published: (2026)

WorldSimBench: Towards Video Generation Models as World Simulators
by: Qin, Yiran, et al.
Published: (2024)

VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward
by: An, Zhaochong, et al.
Published: (2026)

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
by: Zheng, Sixiao, et al.
Published: (2026)

LivingWorld: Interactive 4D World Generation with Environmental Dynamics
by: Mun, Hyeongju, et al.
Published: (2026)

WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation
by: Wang, Gongshu, et al.
Published: (2025)

DSG-World: Learning a 3D Gaussian World Model from Dual State Videos
by: Hu, Wenhao, et al.
Published: (2025)

Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
by: Huang, Jiaxin, et al.
Published: (2025)

MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video
by: Wang, Hengyi, et al.
Published: (2023)

AR4D: Autoregressive 4D Generation from Monocular Videos
by: Zhu, Hanxin, et al.
Published: (2025)

GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation
by: Yang, Zhenya, et al.
Published: (2025)

Gaussian Sequences with Multi-Scale Dynamics for 4D Reconstruction from Monocular Casual Videos
by: Li, Can, et al.
Published: (2026)

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
by: Zhou, Yang, et al.
Published: (2025)

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
by: Wang, Xiaofeng, et al.
Published: (2024)

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
by: Chu, Wen-Hsuan, et al.
Published: (2024)

Predicting 4D Hand Trajectory from Monocular Videos
by: Ye, Yufei, et al.
Published: (2025)

EA3D: Online Open-World 3D Object Extraction from Streaming Videos
by: Zhou, Xiaoyu, et al.
Published: (2025)

IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video
by: Khanbayov, Rasul, et al.
Published: (2026)

WorldMirror: Universal 3D World Reconstruction with Any-Prior Prompting
by: Liu, Yifan, et al.
Published: (2025)

Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video
by: Jiang, Zeren, et al.
Published: (2026)

Realistic Surgical Simulation from Monocular Videos
by: Wang, Kailing, et al.
Published: (2024)

HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
by: Xiong, Weitao, et al.
Published: (2025)

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
by: Wang, Weijie, et al.
Published: (2026)

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
by: Duan, Zicheng, et al.
Published: (2026)

StableWorld: Towards Stable and Consistent Long Interactive Video Generation
by: Yang, Ying, et al.
Published: (2026)

LongDPM: Overlap-Aware 4D Reconstruction from Long Monocular Videos
by: Xu, Chenyi, et al.
Published: (2026)

Diffusion Priors for Dynamic View Synthesis from Monocular Videos
by: Wang, Chaoyang, et al.
Published: (2024)

Towards 3D Objectness Learning in an Open World
by: Liu, Taichi, et al.
Published: (2025)