:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yi; Wang, Yunshuang; Zhang, Zeyu; Tang, Hao
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.11757

Similar Items

Code2World: A GUI World Model via Renderable Code Generation
by: Zheng, Yuhao, et al.
Published: (2026)

Leveraging LLMs and attention-mechanism for automatic annotation of historical maps
by: Yuan, Yunshuang, et al.
Published: (2025)

MWM: Mobile World Models for Action-Conditioned Consistent Prediction
by: Yan, Han, et al.
Published: (2026)

DragMesh: Interactive 3D Generation Made Easy
by: Zhang, Tianshan, et al.
Published: (2025)

WebCode2M: A Real-World Dataset for Code Generation from Webpage Designs
by: Gui, Yi, et al.
Published: (2024)

PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing
by: Li, Peize, et al.
Published: (2026)

Light4D: Training-Free Extreme Viewpoint 4D Video Relighting
by: Wu, Zhenghuang, et al.
Published: (2026)

CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception
by: Yuan, Yunshuang, et al.
Published: (2024)

Thinking with Spatial Code for Physical-World Video Reasoning
by: Chen, Jieneng, et al.
Published: (2026)

GeoWorld: Geometric World Models
by: Zhang, Zeyu, et al.
Published: (2026)

LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
by: Duan, Zicheng, et al.
Published: (2026)

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
by: HY-World Team, et al.
Published: (2026)

UniMesh: Unifying 3D Mesh Understanding and Generation
by: Huang, Peng, et al.
Published: (2026)

3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence
by: Tang, Hao, et al.
Published: (2026)

3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
by: Huang, Ting, et al.
Published: (2025)

Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs
by: Zhang, Houston H., et al.
Published: (2025)

3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)

RTS-Mono: A Real-Time Self-Supervised Monocular Depth Estimation Method for Real-World Deployment
by: Cheng, Zeyu, et al.
Published: (2025)

SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation
by: Wang, Yiling, et al.
Published: (2026)

Follow-Your-Creation: Empowering 4D Creation through Video Inpainting
by: Ma, Yue, et al.
Published: (2025)

VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery
by: Zhang, Nonghai, et al.
Published: (2025)

Generative Visual Code Mobile World Models
by: Koh, Woosung, et al.
Published: (2026)

GigaWorld-0: World Models as Data Engine to Empower Embodied AI
by: GigaWorld Team, et al.
Published: (2025)

WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation
by: Qian, Zezhong, et al.
Published: (2025)

LivingWorld: Interactive 4D World Generation with Environmental Dynamics
by: Mun, Hyeongju, et al.
Published: (2026)

MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation
by: Wang, Hongpeng, et al.
Published: (2026)

ReMoMask: Retrieval-Augmented Masked Motion Generation
by: Li, Zhengdao, et al.
Published: (2025)

WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
by: Lu, Jiachen, et al.
Published: (2023)

StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation
by: Ren, Zeyu, et al.
Published: (2026)

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
by: Wang, Weijie, et al.
Published: (2026)

DreamWorld: Unified World Modeling in Video Generation
by: Tan, Boming, et al.
Published: (2026)

WorldGrow: Generating Infinite 3D World
by: Li, Sikuang, et al.
Published: (2025)

UI2Code^N: UI-to-Code Generation as Interactive Visual Optimization
by: Yang, Zhen, et al.
Published: (2025)

Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing
by: Fu, Jiayi, et al.
Published: (2025)

EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory
by: Wang, Jiahao, et al.
Published: (2025)

HSG: Hyperbolic Scene Graph
by: Wang, Liyang, et al.
Published: (2026)

TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model
by: Chen, Yabo, et al.
Published: (2025)

StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection
by: Yuan, Yunshuang, et al.
Published: (2024)

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
by: HunyuanWorld Team, et al.
Published: (2025)

Generative World Renderer
by: Huang, Zheng-Hui, et al.
Published: (2026)