Bibliographic Details
Main Authors: Huang, Shibo, Shi, Chenfan, Yang, Jian, Dong, Hanlin, Mi, Jinpeng, Li, Ke, Zhang, Jianfeng, Ding, Miao, Liang, Peidong, You, Xiong, Wei, Xian
Format: Preprint
Published: 2025
Subjects: Robotics
Online Access: https://arxiv.org/abs/2503.08330
author Huang, Shibo
Shi, Chenfan
Yang, Jian
Dong, Hanlin
Mi, Jinpeng
Li, Ke
Zhang, Jianfeng
Ding, Miao
Liang, Peidong
You, Xiong
Wei, Xian
contents Autonomous navigation in open-world outdoor environments faces challenges in integrating dynamic conditions, long-distance spatial reasoning, and semantic understanding. Traditional methods struggle to balance local planning, global planning, and semantic task execution, while existing large language models (LLMs) enhance semantic comprehension but lack spatial reasoning capabilities. Although diffusion models excel in local optimization, they fall short in large-scale long-distance navigation. To address these gaps, this paper proposes KiteRunner, a language-driven cooperative local-global navigation strategy that combines UAV orthophoto-based global planning with diffusion model-driven local path generation for long-distance navigation in open-world scenarios. Our method innovatively leverages real-time UAV orthophotography to construct a global probability map, providing traversability guidance for the local planner, while integrating large models like CLIP and GPT to interpret natural language instructions. Experiments demonstrate that KiteRunner achieves 5.6% and 12.8% improvements in path efficiency over state-of-the-art methods in structured and unstructured environments, respectively, with significant reductions in human interventions and execution time.
format Preprint
id arxiv_https___arxiv_org_abs_2503_08330
institution arXiv
publishDate 2025
record_format arxiv
title KiteRunner: Language-Driven Cooperative Local-Global Navigation Policy with UAV Mapping in Outdoor Environments
topic Robotics
url https://arxiv.org/abs/2503.08330