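Staff View: LCGNav: Local Candidate-Aware Geometric Enhancement for General Topological Planning in Vision-Language Navigation :: Library Catalog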

Bibliographic Details
Main Authors:	Peng, Jiankun; Guo, Jianyuan; Yang, Yiguang; Liu, Yue; Yan, Jiashuang; Xu, Ying
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.09053

_version_	1866910205456941056
author	Peng, Jiankun; Guo, Jianyuan; Yang, Yiguang; Liu, Yue; Yan, Jiashuang; Xu, Ying
author_facet	Peng, Jiankun; Guo, Jianyuan; Yang, Yiguang; Liu, Yue; Yan, Jiashuang; Xu, Ying
contents	Online topological planning has become an effective paradigm for Vision-Language Navigation in Continuous Environments (VLN-CE), but existing methods still suffer from two limitations: redundant local depth information and weakened focus on current frontier candidates as the topological graph grows. To address this, we propose LCGNav, a modular local geometric enhancement framework for topological VLN. LCGNav explicitly converts candidate depth views into 3D point clouds and applies physical truncation based on the agent's reachable range, enabling more compact local geometric modeling. It further introduces a dimension-preserving local fusion strategy with transient state degradation, so that geometric enhancement is applied only to the currently relevant ghost nodes without changing the original planner interface. Experiments on R2R-CE and RxR-CE show that LCGNav serves as an effective cross-architecture enhancement module, consistently improving multiple key metrics of representative online topological baselines with low additional training cost. When integrated with ETP-R1, LCGNav achieves the best performance among the compared online topological methods on the val-unseen splits of the R2R-CE and RxR-CE benchmarks. The code is available at https://github.com/shannanshouyin/LCGNav.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_09053
institution	arXiv
publishDate	2026
record_format	arxiv
title	LCGNav: Local Candidate-Aware Geometric Enhancement for General Topological Planning in Vision-Language Navigation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2605.09053
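
Note on the contents field: the abstract describes converting candidate depth views into 3D point clouds and truncating them by the agent's reachable range. The following is a minimal sketch of that general idea, not the authors' implementation (see the repository linked in the abstract for that). It assumes a pinhole camera with square pixels, a centred principal point, and z-depth in metres. The function names, the field-of-view parameter, and the range value are hypothetical and do not come from the record.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth: List[List[float]], hfov_deg: float) -> List[Point]:
    """Back-project a depth image into camera-frame 3D points.

    Assumption: pinhole model, square pixels, principal point at the
    image centre, depth values are z-distances in metres.
    """
    height, width = len(depth), len(depth[0])
    # Focal length in pixels derived from the horizontal field of view.
    focal = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0

    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # skip invalid or missing depth readings
            points.append(((u - cx) * z / focal, (v - cy) * z / focal, z))
    return points


def truncate_by_range(points: List[Point], max_range: float) -> List[Point]:
    """Keep only points within max_range metres of the camera origin.

    Assumption: max_range stands in for the agent's reachable range.
    """
    return [
        p for p in points
        if math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) <= max_range
    ]


# Hypothetical 2x2 depth view: two near pixels, one far pixel, one invalid.
example_depth = [[1.0, 1.0], [5.0, 0.0]]
example_points = depth_to_points(example_depth, hfov_deg=90.0)
example_local = truncate_by_range(example_points, max_range=3.0)
```

Discarding far points before any feature extraction is one way to obtain the more compact local geometry the abstract refers to. How the truncated cloud is encoded and fused into the planner is not specified in this record.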

Similar Items