Table of Contents :: Library Catalog

Bibliographic Details
Main Authors:	Zhu, Ziyue, Wu, Zhanqian, Zhu, Zhenxin, Zhou, Lijun, Sun, Haiyang, Wan, Bing, Ma, Kun, Chen, Guang, Ye, Hangjun, Xie, Jin, Yang, Jian
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Links:	https://arxiv.org/abs/2509.23402

Table of Contents:

Recent advances in driving-scene generation and reconstruction have demonstrated significant potential for enhancing autonomous driving systems by producing scalable and controllable training data. Existing generation methods primarily focus on synthesizing diverse and high-fidelity driving videos; however, due to limited 3D consistency and sparse viewpoint coverage, they struggle to support convenient and high-quality novel-view synthesis (NVS). Conversely, recent 3D/4D reconstruction approaches have significantly improved NVS for real-world driving scenes, yet inherently lack generative capabilities. To overcome this dilemma between scene generation and reconstruction, we propose WorldSplat, a novel feed-forward framework for 4D driving-scene generation. Our approach effectively generates consistent multi-track videos through two key steps: (i) We introduce a 4D-aware latent diffusion model integrating multi-modal information to produce pixel-aligned 4D Gaussians in a feed-forward manner. (ii) Subsequently, we refine the novel-view videos rendered from these Gaussians using an enhanced video diffusion model. Extensive experiments conducted on benchmark datasets demonstrate that WorldSplat effectively generates high-fidelity, temporally and spatially consistent multi-track novel-view driving videos. Project: https://wm-research.github.io/worldsplat/
