| Main Authors: | Zheng, Hongpei, Li, Shijie, Li, Yanran, Yin, Hujun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.03284 |
Similar Items
Reg3D: Reconstructive Geometry Instruction Tuning for 3D Scene Understanding
by: Zheng, Hongpei, et al.
Published: (2025)
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning
by: Ma, Wufei, et al.
Published: (2025)
Geometric Prior-Guided Neural Implicit Surface Reconstruction in the Wild
by: Xiang, Lintao, et al.
Published: (2025)
PointGS: Point Attention-Aware Sparse View Synthesis with Gaussian Splatting
by: Xiang, Lintao, et al.
Published: (2025)
IntentionNav: A Benchmark for Intent-Driven Object Navigation from Implicit Human Instruction
by: Qian, Lin, et al.
Published: (2026)
NIS-SLAM: Neural Implicit Semantic RGB-D SLAM for 3D Consistent Scene Understanding
by: Zhai, Hongjia, et al.
Published: (2024)
Enhancing MLLM Spatial Understanding via Active 3D Scene Exploration for Multi-Perspective Reasoning
by: Chen, Jiahua, et al.
Published: (2026)
SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes
by: Huang, Jiaxin, et al.
Published: (2025)
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
by: Zhang, Frank, et al.
Published: (2024)
Grounding by Remembering: Cross-Scene and In-Scene Memory for 3D Functional Affordances
by: Wang, Qirui, et al.
Published: (2026)
SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
by: Zhang, Jian, et al.
Published: (2026)
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation
by: Dong, Wenqi, et al.
Published: (2025)
SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion
by: Guo, Xiyue, et al.
Published: (2025)
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding
by: Li, Hao, et al.
Published: (2023)
SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes
by: Liu, Tianhui, et al.
Published: (2026)
CLEAR-IR: Clarity-Enhanced Active Reconstruction of Infrared Imagery
by: Shankar, Nathan, et al.
Published: (2025)
InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes
by: Yang, Zesong, et al.
Published: (2025)
Intelligent Spatial Perception by Building Hierarchical 3D Scene Graphs for Indoor Scenarios with the Help of LLMs
by: Cheng, Yao, et al.
Published: (2025)
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
by: Zhang, Zefeng, et al.
Published: (2025)
Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding
by: Jeon, Yerim, et al.
Published: (2025)
HMR3D: Hierarchical Multimodal Representation for 3D Scene Understanding with Large Vision-Language Model
by: Li, Chen, et al.
Published: (2025)
Scene-R1: Video-Grounded Large Language Models for 3D Scene Reasoning without 3D Annotations
by: Yuan, Zhihao, et al.
Published: (2025)
A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
by: Jiang, Siyang, et al.
Published: (2025)
Generating Human Motion in 3D Scenes from Text Descriptions
by: Cen, Zhi, et al.
Published: (2024)
Leveraging 2D-VLM for Label-Free 3D Segmentation in Large-Scale Outdoor Scene Understanding
by: Nishimura, Toshihiko, et al.
Published: (2026)
SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations
by: Zhang, Songchun, et al.
Published: (2025)
PathoHR: Hierarchical Reasoning for Vision-Language Models in Pathology
by: Huang, Yating, et al.
Published: (2025)
S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction
by: Zheng, Guangting, et al.
Published: (2025)
Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
by: Wang, Ziyang, et al.
Published: (2025)
Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations
by: Yuan, Jiangye, et al.
Published: (2026)
SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving
by: Li, Yiming, et al.
Published: (2023)
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
by: Huang, Ting, et al.
Published: (2025)
Open-Vocabulary Octree-Graph for 3D Scene Understanding
by: Wang, Zhigang, et al.
Published: (2024)
SAM-Guided Masked Token Prediction for 3D Scene Understanding
by: Chen, Zhimin, et al.
Published: (2024)
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
by: Luo, Jingzhou, et al.
Published: (2025)
Spatial As Deep: Spatial CNN for Traffic Scene Understanding
by: Pan, Xingang, et al.
Published: (2017)
IL3D: A Large-Scale Indoor Layout Dataset for LLM-Driven 3D Scene Generation
by: Zhou, Wenxu, et al.
Published: (2025)
SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
by: Cao, Meng, et al.
Published: (2025)
R2G: Reasoning to Ground in 3D Scenes
by: Li, Yixuan, et al.
Published: (2024)
Think3D: Thinking with Space for Spatial Reasoning
by: Zhang, Zaibin, et al.
Published: (2026)