:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Haiming; Yan, Xu; Xue, Ying; Guo, Zixuan; Cui, Shuguang; Li, Zhen; Liu, Bingbing
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.17027

Similar Items

An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training
by: Zhang, Haiming, et al.
Published: (2024)

DLWM: Dual Latent World Models enable Holistic Gaussian-centric Pre-training in Autonomous Driving
by: Zhu, Yiyao, et al.
Published: (2026)

SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
by: Zhang, Haiming, et al.
Published: (2025)

Benchmarking the Robustness of LiDAR Semantic Segmentation Models
by: Yan, Xu, et al.
Published: (2023)

VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving
by: Zhang, Haiming, et al.
Published: (2024)

WPT: World-to-Policy Transfer via Online World Model Distillation
by: Jiang, Guangfeng, et al.
Published: (2025)

FASTopoWM: Fast-Slow Lane Segment Topology Reasoning with Latent World Models
by: Yang, Yiming, et al.
Published: (2025)

DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
by: Luo, Yueru, et al.
Published: (2024)

DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving
by: Lin, Hongbin, et al.
Published: (2025)

RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS
by: Fu, Chuanyu, et al.
Published: (2025)

World2VLM: Distilling World Model Imagination into VLMs for Dynamic Spatial Reasoning
by: Zhang, Wanyue, et al.
Published: (2026)

FutureX: Enhance End-to-End Autonomous Driving via Latent Chain-of-Thought World Model
by: Lin, Hongbin, et al.
Published: (2025)

RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS
by: Fu, Chuanyu, et al.
Published: (2025)

Fully Test-Time Adaptation for Monocular 3D Object Detection
by: Lin, Hongbin, et al.
Published: (2024)

VectorWorld: Efficient Streaming World Model via Diffusion Flow on Vector Graphs
by: Jiang, Chaokang, et al.
Published: (2026)

SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries
by: Dang, Chenxu, et al.
Published: (2025)

TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
by: Chen, Yabo, et al.
Published: (2025)

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
by: HY-World, Team, et al.
Published: (2026)

FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
by: Guo, Jun, et al.
Published: (2025)

UVLM: Benchmarking Video Language Model for Underwater World Understanding
by: Xue, Xizhe, et al.
Published: (2025)

EyeWorld: A Generative World Model of Ocular State and Dynamics
by: Gao, Ziyu, et al.
Published: (2026)

SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs
by: Li, Leheng, et al.
Published: (2024)

DynFlowDrive: Flow-Based Dynamic World Modeling for Autonomous Driving
by: Liu, Xiaolu, et al.
Published: (2026)

SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models
by: Zhang, Yue, et al.
Published: (2024)

DreamWorld: Unified World Modeling in Video Generation
by: Tan, Boming, et al.
Published: (2026)

Scene-R1: Video-Grounded Large Language Models for 3D Scene Reasoning without 3D Annotations
by: Yuan, Zhihao, et al.
Published: (2025)

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
by: Lin, Hongbin, et al.
Published: (2025)

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
by: Duan, Zicheng, et al.
Published: (2026)

RenderWorld: World Model with Self-Supervised 3D Label
by: Yan, Ziyang, et al.
Published: (2024)

Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models
by: Xiao, Junyuan, et al.
Published: (2026)

MAD: Motion Appearance Decoupling for efficient Driving World Models
by: Rahimi, Ahmad, et al.
Published: (2026)

AstraNav-World: World Model for Foresight Control and Consistency
by: Chen, Jintao, et al.
Published: (2025)

$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting
by: Liao, Zhimin, et al.
Published: (2025)

Can World Models Benefit VLMs for World Dynamics?
by: Zhang, Kevin, et al.
Published: (2025)

Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection
by: Zheng, Chaoda, et al.
Published: (2024)

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
by: Lu, Jiahao, et al.
Published: (2026)

Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling
by: Cao, Meng, et al.
Published: (2025)

WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG
by: Li, Zhen, et al.
Published: (2026)

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
by: Zheng, Sixiao, et al.
Published: (2026)

HOLODECK 2.0: Vision-Language-Guided 3D World Generation with Editing
by: Bian, Zixuan, et al.
Published: (2025)