| Main Authors: | Wang, Liyang, Zhang, Zeyu, Tang, Hao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.17454 |
Similar Items
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
by: Huang, Ting, et al.
Published: (2025)
Nav-R1: Reasoning and Navigation in Embodied Scenes
by: Liu, Qingxiang, et al.
Published: (2025)
StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
by: Wu, Zhengri, et al.
Published: (2025)
DC-Scene: Data-Centric Learning for 3D Scene Understanding
by: Huang, Ting, et al.
Published: (2025)
StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation
by: Ren, Zeyu, et al.
Published: (2026)
Multiview Scene Graph
by: Zhang, Juexiao, et al.
Published: (2024)
DragMesh: Interactive 3D Generation Made Easy
by: Zhang, Tianshan, et al.
Published: (2025)
Code2Worlds: Empowering Coding LLMs for 4D World Generation
by: Zhang, Yi, et al.
Published: (2026)
SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation
by: Wang, Yiling, et al.
Published: (2026)
PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing
by: Li, Peize, et al.
Published: (2026)
3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence
by: Tang, Hao, et al.
Published: (2026)
MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation
by: Wang, Hongpeng, et al.
Published: (2026)
3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)
ReMoMask: Retrieval-Augmented Masked Motion Generation
by: Li, Zhengdao, et al.
Published: (2025)
Universal Scene Graph Generation
by: Wu, Shengqiong, et al.
Published: (2025)
AnyDepth: Depth Estimation Made Easy
by: Ren, Zeyu, et al.
Published: (2026)
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
by: Hao, Peng, et al.
Published: (2024)
GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
by: Ruiz, Antonio, et al.
Published: (2025)
FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding
by: Zhan, Chenlu, et al.
Published: (2025)
VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery
by: Zhang, Nonghai, et al.
Published: (2025)
FDSG: Forecasting Dynamic Scene Graphs
by: Yang, Yi, et al.
Published: (2025)
Light4D: Training-Free Extreme Viewpoint 4D Video Relighting
by: Wu, Zhenghuang, et al.
Published: (2026)
UniMesh: Unifying 3D Mesh Understanding and Generation
by: Huang, Peng, et al.
Published: (2026)
MMA: Multimodal Memory Agent
by: Lu, Yihao, et al.
Published: (2026)
A Hyperbolic Perspective on Hierarchical Structure in Object-Centric Scene Representations
by: Madan, Neelu, et al.
Published: (2026)
EvoVLA: Self-Evolving Vision-Language-Action Model
by: Liu, Zeting, et al.
Published: (2025)
Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting
by: Cheng, Chong, et al.
Published: (2025)
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
by: Zhou, Changqing, et al.
Published: (2026)
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
by: Wu, Shengqiong, et al.
Published: (2025)
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
by: Khandelwal, Naitik, et al.
Published: (2023)
H2G: Hierarchy-Aware Hyperbolic Grouping for 3D Scenes
by: Ko, ByungHa, et al.
Published: (2026)
Controllable 3D Outdoor Scene Generation via Scene Graphs
by: Liu, Yuheng, et al.
Published: (2025)
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
by: Yang, Zeyu, et al.
Published: (2023)
Lite3R: A Model-Agnostic Framework for Efficient Feed-Forward 3D Reconstruction
by: Zhang, Haoyu, et al.
Published: (2026)
MWM: Mobile World Models for Action-Conditioned Consistent Prediction
by: Yan, Han, et al.
Published: (2026)
GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning
by: Ma, Guoqing, et al.
Published: (2026)
Toward Scene Graph and Layout Guided Complex 3D Scene Generation
by: Huang, Yu-Hsiang, et al.
Published: (2024)
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
by: Lu, Jiachen, et al.
Published: (2023)
GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding
by: Wang, Xihan, et al.
Published: (2025)
Unbiased Scene Graph Generation from Biased Training
by: Tang, Kaihua, et al.
Published: (2020)