:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Liyang; Zhang, Zeyu; Tang, Hao
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.17454

Similar Items

3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
by: Huang, Ting, et al.
Published: (2025)

Nav-R1: Reasoning and Navigation in Embodied Scenes
by: Liu, Qingxiang, et al.
Published: (2025)

StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
by: Wu, Zhengri, et al.
Published: (2025)

DC-Scene: Data-Centric Learning for 3D Scene Understanding
by: Huang, Ting, et al.
Published: (2025)

StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation
by: Ren, Zeyu, et al.
Published: (2026)

Multiview Scene Graph
by: Zhang, Juexiao, et al.
Published: (2024)

DragMesh: Interactive 3D Generation Made Easy
by: Zhang, Tianshan, et al.
Published: (2025)

Code2Worlds: Empowering Coding LLMs for 4D World Generation
by: Zhang, Yi, et al.
Published: (2026)

SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation
by: Wang, Yiling, et al.
Published: (2026)

PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing
by: Li, Peize, et al.
Published: (2026)

3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence
by: Tang, Hao, et al.
Published: (2026)

MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation
by: Wang, Hongpeng, et al.
Published: (2026)

3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)

ReMoMask: Retrieval-Augmented Masked Motion Generation
by: Li, Zhengdao, et al.
Published: (2025)

Universal Scene Graph Generation
by: Wu, Shengqiong, et al.
Published: (2025)

AnyDepth: Depth Estimation Made Easy
by: Ren, Zeyu, et al.
Published: (2026)

BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
by: Hao, Peng, et al.
Published: (2024)

GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
by: Ruiz, Antonio, et al.
Published: (2025)

FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding
by: Zhan, Chenlu, et al.
Published: (2025)

VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery
by: Zhang, Nonghai, et al.
Published: (2025)

FDSG: Forecasting Dynamic Scene Graphs
by: Yang, Yi, et al.
Published: (2025)

Light4D: Training-Free Extreme Viewpoint 4D Video Relighting
by: Wu, Zhenghuang, et al.
Published: (2026)

UniMesh: Unifying 3D Mesh Understanding and Generation
by: Huang, Peng, et al.
Published: (2026)

MMA: Multimodal Memory Agent
by: Lu, Yihao, et al.
Published: (2026)

A Hyperbolic Perspective on Hierarchical Structure in Object-Centric Scene Representations
by: Madan, Neelu, et al.
Published: (2026)

EvoVLA: Self-Evolving Vision-Language-Action Model
by: Liu, Zeting, et al.
Published: (2025)

Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting
by: Cheng, Chong, et al.
Published: (2025)

Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
by: Zhou, Changqing, et al.
Published: (2026)

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
by: Wu, Shengqiong, et al.
Published: (2025)

Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
by: Khandelwal, Naitik, et al.
Published: (2023)

H2G: Hierarchy-Aware Hyperbolic Grouping for 3D Scenes
by: Ko, ByungHa, et al.
Published: (2026)

Controllable 3D Outdoor Scene Generation via Scene Graphs
by: Liu, Yuheng, et al.
Published: (2025)

Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
by: Yang, Zeyu, et al.
Published: (2023)

Lite3R: A Model-Agnostic Framework for Efficient Feed-Forward 3D Reconstruction
by: Zhang, Haoyu, et al.
Published: (2026)

MWM: Mobile World Models for Action-Conditioned Consistent Prediction
by: Yan, Han, et al.
Published: (2026)

GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning
by: Ma, Guoqing, et al.
Published: (2026)

Toward Scene Graph and Layout Guided Complex 3D Scene Generation
by: Huang, Yu-Hsiang, et al.
Published: (2024)

WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
by: Lu, Jiachen, et al.
Published: (2023)

GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding
by: Wang, Xihan, et al.
Published: (2025)

Unbiased Scene Graph Generation from Biased Training
by: Tang, Kaihua, et al.
Published: (2020)