Bibliographic Details
Main Authors: Fang, YukTungSamuel, Shi, Zhikang, Qiu, Jiabin, Chen, Zixuan, Shi, Jieqi, Xu, Hao, Huo, Jing, Gao, Yang
Format: Preprint
Published: 2026
Subjects: Robotics
Online Access: https://arxiv.org/abs/2602.12971
_version_ 1866915960333533184
author Fang, YukTungSamuel
Shi, Zhikang
Qiu, Jiabin
Chen, Zixuan
Shi, Jieqi
Xu, Hao
Huo, Jing
Gao, Yang
contents Driven by recent advances in foundation models, semantic scene graphs have emerged as a promising paradigm for high-level 3D environmental abstraction in robot navigation. However, existing frameworks struggle to handle complex embodied queries while constructing the semantic graph continuously. To address these limitations, we present INHerit-SG, an asynchronous dual-stream architecture that structures the 3D environment into a RAG-ready knowledge base. Specifically, our framework integrates comprehensive node representations, an event-triggered asynchronous update scheme, and a structured retrieval mechanism. Geometric segmentation is decoupled from semantic reasoning to maintain mapping efficiency, and the semantic nodes additionally store natural language summaries to support text-based retrieval. Furthermore, we propose an interpretable retrieval pipeline that couples the reasoning capabilities of multi-role LLMs with the topological structure of the scene graph, followed by a visual verification step to mitigate false positives. We evaluate INHerit-SG on HM3DSem-SQR, a newly constructed benchmark for complex embodied semantic query retrieval, and in real-world environments. Experiments demonstrate that our system achieves state-of-the-art performance on complex queries, especially those involving negations and chained spatial constraints. Project Page: https://fangyuktung.github.io/INHeritSG.github.io/
format Preprint
id arxiv_https___arxiv_org_abs_2602_12971
institution arXiv
publishDate 2026
record_format arxiv
title INHerit-SG: Incremental Hierarchical Semantic Scene Graphs with RAG-Style Retrieval
topic Robotics
url https://arxiv.org/abs/2602.12971