| Main Authors: | Fang, YukTungSamuel; Shi, Zhikang; Qiu, Jiabin; Chen, Zixuan; Shi, Jieqi; Xu, Hao; Huo, Jing; Gao, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Robotics |
| Online Access: | https://arxiv.org/abs/2602.12971 |
| _version_ | 1866915960333533184 |
|---|---|
| author | Fang, YukTungSamuel; Shi, Zhikang; Qiu, Jiabin; Chen, Zixuan; Shi, Jieqi; Xu, Hao; Huo, Jing; Gao, Yang |
| author_facet | Fang, YukTungSamuel; Shi, Zhikang; Qiu, Jiabin; Chen, Zixuan; Shi, Jieqi; Xu, Hao; Huo, Jing; Gao, Yang |
| contents | Driven by recent advancements in foundation models, semantic scene graphs have emerged as a promising paradigm for high-level 3D environmental abstraction in robot navigation. However, existing frameworks struggle to handle complex embodied queries while ensuring continuous semantic graph construction. To address these limitations, we present INHerit-SG, an asynchronous dual-stream architecture that systematically structures the 3D environment into a RAG-ready knowledge base. Specifically, our framework integrates comprehensive node representations, an event-triggered asynchronous update scheme, and a structured retrieval mechanism. While geometric segmentation is decoupled from semantic reasoning to maintain mapping efficiency, the semantic nodes also store natural language summaries to support text-based retrieval. Furthermore, we propose an interpretable retrieval pipeline that couples the reasoning capabilities of multi-role LLMs with the topological structure of the scene graph, followed by a visual verification process to mitigate false positives. We evaluate INHerit-SG on a newly constructed benchmark for complex embodied semantic query retrieval, HM3DSem-SQR, and in real-world environments. Experiments demonstrate that our system achieves state-of-the-art performance on complex queries, especially those involving negations and chained spatial constraints. Project Page: https://fangyuktung.github.io/INHeritSG.github.io/ |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_12971 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | INHerit-SG: Incremental Hierarchical Semantic Scene Graphs with RAG-Style Retrieval |
| topic | Robotics |
| url | https://arxiv.org/abs/2602.12971 |