Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Yuan, Yu; Wickremasinghe, Tharindu; Nadir, Zeeshan; Wang, Xijun; Chi, Yiheng; Chan, Stanley H.
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.03350

_version_	1866917365319467008
author	Yuan, Yu Wickremasinghe, Tharindu Nadir, Zeeshan Wang, Xijun Chi, Yiheng Chan, Stanley H.
author_facet	Yuan, Yu Wickremasinghe, Tharindu Nadir, Zeeshan Wang, Xijun Chi, Yiheng Chan, Stanley H.
contents	Images and videos are discrete 2D projections of the 4D world (3D space + time). Most visual understanding, prediction, and generation methods operate directly on 2D observations, leading to suboptimal performance. We propose SeeU, a novel approach that learns the continuous 4D dynamics and generates unseen visual content. The principle behind SeeU is a new 2D$\to$4D$\to$2D learning framework. SeeU first reconstructs the 4D world from sparse, monocular 2D frames (2D$\to$4D). It then learns the continuous 4D dynamics on a low-rank representation and physical constraints (discrete 4D$\to$continuous 4D). Finally, SeeU rolls the world forward in time, re-projects it back to 2D at sampled times and viewpoints, and generates unseen regions based on spatial-temporal context awareness (4D$\to$2D). By modeling dynamics in 4D, SeeU achieves continuous and physically consistent novel visual generation, demonstrating strong potential in multiple tasks, including unseen temporal generation, unseen spatial generation, and video editing. All data and code will be public at https://yuyuanspace.com/SeeU/
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_03350
institution	arXiv
publishDate	2025
record_format	arxiv
title	SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2512.03350
