| Main Authors: | Xu, Haoxuan, Li, Tianfu, Chen, Wenbo, Liu, Yi, Wu, Jin, Lei, Huashuo, Lou, Yunfan, Wang, Lujia, Wang, Hesheng, Li, Haoang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.13321 |
Similar Items
P$^{3}$Nav: End-to-End Perception, Prediction and Planning for Vision-and-Language Navigation
by: Li, Tianfu, et al.
Published: (2026)
Enhancing Vision-Language Navigation with Multimodal Event Knowledge from Real-World Indoor Tour Videos
by: Xu, Haoxuan, et al.
Published: (2026)
FlowVLA: Visual Chain of Thought-based Motion Reasoning for Vision-Language-Action Models
by: Zhong, Zhide, et al.
Published: (2025)
SEDualVLN: A Spatially-Enhanced Dual-System for Vision-Language Navigation
by: Huang, Jingzhi, et al.
Published: (2026)
Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation
by: Wang, Shuo, et al.
Published: (2025)
Efficient Long-Horizon Vision-Language-Action Models via Static-Dynamic Disentanglement
by: Qiu, Weikang, et al.
Published: (2026)
SkyVLN: Vision-and-Language Navigation and NMPC Control for UAVs in Urban Environments
by: Li, Tianshun, et al.
Published: (2025)
Towards the Vision-Sound-Language-Action Paradigm: The HEAR Framework for Sound-Centric Manipulation
by: Nie, Chang, et al.
Published: (2026)
Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline
by: Song, Wenxuan, et al.
Published: (2026)
Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation
by: Pan, Yiyuan, et al.
Published: (2024)
PG-SLAM: Photo-realistic and Geometry-aware RGB-D SLAM in Dynamic Environments
by: Li, Haoang, et al.
Published: (2024)
DyGeoVLN: Infusing Dynamic Geometry Foundation Model into Vision-Language Navigation
by: Liu, Xiangchen, et al.
Published: (2026)
AERR-Nav: Adaptive Exploration-Recovery-Reminiscing Strategy for Zero-Shot Object Navigation
by: Huang, Jingzhi, et al.
Published: (2026)
PD-VLA: Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding
by: Song, Wenxuan, et al.
Published: (2025)
Real-Time Metric-Semantic Mapping for Autonomous Navigation in Outdoor Environments
by: Jiao, Jianhao, et al.
Published: (2024)
Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation
by: Wang, Shuo, et al.
Published: (2025)
RationalVLA: A Rational Vision-Language-Action Model with Dual System
by: Song, Wenxuan, et al.
Published: (2025)
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark
by: Lei, Huashuo, et al.
Published: (2026)
Hierarchical Semantic-Augmented Navigation: Optimal Transport and Graph-Driven Reasoning for Vision-Language Navigation
by: Fang, Xiang, et al.
Published: (2026)
VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation
by: Yu, Bangguo, et al.
Published: (2024)
Dream-SLAM: Dreaming the Unseen for Active SLAM in Dynamic Environments
by: Meng, Xiangqi, et al.
Published: (2026)
CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding
by: Song, Wenxuan, et al.
Published: (2025)
Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance
by: Zhang, Lingfeng, et al.
Published: (2026)
DecoVLN: Decoupling Observation, Reasoning, and Correction for Vision-and-Language Navigation
by: Xin, Zihao, et al.
Published: (2026)
Enhancing Exploratory Capability of Visual Navigation Using Uncertainty of Implicit Scene Representation
by: Wang, Yichen, et al.
Published: (2024)
Deployable Vision-driven UAV River Navigation via Human-in-the-loop Preference Alignment
by: Wang, Zihan, et al.
Published: (2025)
Narrate2Nav: Real-Time Visual Navigation with Implicit Language Reasoning in Human-Centric Environments
by: Payandeh, Amirreza, et al.
Published: (2025)
Interactive Navigation for Legged Manipulators with Learned Arm-Pushing Controller
by: Bi, Zhihai, et al.
Published: (2025)
VLA-OPD: Bridging Offline SFT and Online RL for Vision-Language-Action Models via On-Policy Distillation
by: Zhong, Zhide, et al.
Published: (2026)
Seeing through Uncertainty: Robust Task-Oriented Optimization in Visual Navigation
by: Pan, Yiyuan, et al.
Published: (2025)
AdaNav: Adaptive Reasoning with Uncertainty for Vision-Language Navigation
by: Ding, Xin, et al.
Published: (2025)
BEINGS: Bayesian Embodied Image-goal Navigation with Gaussian Splatting
by: Meng, Wugang, et al.
Published: (2024)
Continually Evolving Skill Knowledge in Vision Language Action Model
by: Wu, Yuxuan, et al.
Published: (2025)
Bridging the 2D-3D Gap: A Hierarchical Semantic-Geometric Map for Vision Language Navigation
by: Li, Kailing, et al.
Published: (2026)
Towards Autonomous Indoor Parking: A Globally Consistent Semantic SLAM System and A Semantic Localization Subsystem
by: Sha, Yichen, et al.
Published: (2024)
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model
by: Li, Fuhao, et al.
Published: (2025)
HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models
by: Chen, Zixing, et al.
Published: (2026)
VL-Nav: A Neuro-Symbolic Approach for Reasoning-based Vision-Language Navigation
by: Du, Yi, et al.
Published: (2025)
ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation
by: Xue, Wei, et al.
Published: (2026)
MonoDream: Monocular Vision-Language Navigation with Panoramic Dreaming
by: Wang, Shuo, et al.
Published: (2025)