:: Library Catalog

Bibliographic Details
Main Authors:	Xia, Xingyu; Zhou, Lekai; Tang, Yujie; Zhu, Xiaozhou; Zhu, Hai; Yao, Wen
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2604.07705

Similar Items

ImagineUAV: Aerial Vision-Language Navigation via World-Action Modeling and Kinodynamic Planning
by: Liu, Xuchen, et al.
Published: (2026)

VLA-AN: An Efficient and Onboard Vision-Language-Action Framework for Aerial Navigation in Complex Environments
by: Wu, Yuze, et al.
Published: (2025)

CLASH: Collaborative Large-Small Hierarchical Framework for Continuous Vision-and-Language Navigation
by: Wang, Liuyi, et al.
Published: (2025)

OpenVLN: Open-world Aerial Vision-Language Navigation
by: Lin, Peican, et al.
Published: (2025)

PANav: Toward Privacy-Aware Robot Navigation via Vision-Language Models
by: Yu, Bangguo, et al.
Published: (2024)

AgentVLN: Towards Agentic Vision-and-Language Navigation
by: Xin, Zihao, et al.
Published: (2026)

Modelling and Optimization of Magnetic Navigation Systems for Passive Robots in Minimally Invasive Brain Surgery
by: Tang, Xu, et al.
Published: (2025)

TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
by: Wen, Junjie, et al.
Published: (2024)

Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System
by: Liu, Haokun, et al.
Published: (2025)

OnFly: Onboard Zero-Shot Aerial Vision-Language Navigation toward Safety and Efficiency
by: Zheng, Guiyong, et al.
Published: (2026)

Online Robot Navigation and Manipulation with Distilled Vision-Language Models
by: Liu, Kangcheng
Published: (2024)

DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
by: Wen, Junjie, et al.
Published: (2025)

WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
by: Zhao, Baining, et al.
Published: (2026)

OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models
by: Kuang, Yuxuan, et al.
Published: (2024)

HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation
by: Fan, Chengjie, et al.
Published: (2026)

Resolving Positional Ambiguity in Dialogues by Vision-Language Models for Robot Navigation
by: Chen, Kuan-Lin, et al.
Published: (2024)

Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
by: Gao, Yunpeng, et al.
Published: (2024)

History-Enhanced Two-Stage Transformer for Aerial Vision-and-Language Navigation
by: Ding, Xichen, et al.
Published: (2025)

Socially-Aware Robot Navigation Enhanced by Bidirectional Natural Language Conversations Using Large Language Models
by: Wen, Congcong, et al.
Published: (2024)

TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments
by: Huang, Zhiyu, et al.
Published: (2026)

Cog-GA: A Large Language Models-based Generative Agent for Vision-Language Navigation in Continuous Environments
by: Li, Zhiyuan, et al.
Published: (2024)

Probing Prompt Design for Socially Compliant Robot Navigation with Vision Language Models
by: Xiao, Ling, et al.
Published: (2026)

Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments
by: Elnoor, Mohamed, et al.
Published: (2024)

Toward Embodiment Equivariant Vision-Language-Action Policy
by: Chen, Anzhe, et al.
Published: (2025)

AINav: Large Language Model-Based Adaptive Interactive Navigation
by: Zhou, Kangjie, et al.
Published: (2025)

ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
by: Zhou, Zhongyi, et al.
Published: (2025)

LLM-BT: Performing Robotic Adaptive Tasks based on Large Language Models and Behavior Trees
by: Zhou, Haotian, et al.
Published: (2024)

HARP-VLA: Human-Robot Aligned Representation Learning for Vision-Language-Action Model
by: Zhu, Xiang, et al.
Published: (2026)

CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
by: Zhang, Weichen, et al.
Published: (2025)

T-araVLN: Translator for Agricultural Robotic Agents on Vision-and-Language Navigation
by: Zhao, Xiaobei, et al.
Published: (2025)

Can Pretrained Vision-Language Embeddings Alone Guide Robot Navigation?
by: Subedi, Nitesh, et al.
Published: (2025)

OmniVLA: An Omni-Modal Vision-Language-Action Model for Robot Navigation
by: Hirose, Noriaki, et al.
Published: (2025)

Hey Robot! Personalizing Robot Navigation through Model Predictive Control with a Large Language Model
by: Martinez-Baselga, Diego, et al.
Published: (2024)

Safe-VLN: Collision Avoidance for Vision-and-Language Navigation of Autonomous Robots Operating in Continuous Environments
by: Yue, Lu, et al.
Published: (2023)

Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models
by: Zhang, Zhen, et al.
Published: (2023)

NaVILA: Legged Robot Vision-Language-Action Model for Navigation
by: Cheng, An-Chieh, et al.
Published: (2024)

LaViRA: Language-Vision-Robot Actions Translation for Zero-Shot Vision Language Navigation in Continuous Environments
by: Ding, Hongyu, et al.
Published: (2025)

RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features
by: Zhang, Haoxin, et al.
Published: (2025)

Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications
by: Kawaharazuka, Kento, et al.
Published: (2025)

ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
by: Zhao, Xinxin, et al.
Published: (2024)