Bibliographic Details
Main Authors: Dong, Yifei; Wu, Fengyi; He, Qi; Cheng, Zhi-Qi; Li, Heng; Li, Minghan; Cheng, Zebang; Zhou, Yuxuan; Sun, Jingdong; Dai, Qi; Hauptmann, Alexander G
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence; Computer Vision and Pattern Recognition; Robotics
Online Access: https://arxiv.org/abs/2503.14229
_version_ 1866908585372418048
author Dong, Yifei
Wu, Fengyi
He, Qi
Cheng, Zhi-Qi
Li, Heng
Li, Minghan
Cheng, Zebang
Zhou, Yuxuan
Sun, Jingdong
Dai, Qi
Hauptmann, Alexander G
contents Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous settings, with little attention to dynamic, crowded environments. We present HA-VLN 2.0, a unified benchmark introducing explicit social-awareness constraints. Our contributions are: (i) a standardized task and metrics capturing both goal accuracy and personal-space adherence; (ii) the HAPS 2.0 dataset and simulators modeling multi-human interactions, outdoor contexts, and finer language-motion alignment; (iii) benchmarks on 16,844 socially grounded instructions, revealing sharp performance drops of leading agents under human dynamics and partial observability; and (iv) real-world robot experiments validating sim-to-real transfer, with an open leaderboard enabling transparent comparison. Results show that explicit social modeling improves navigation robustness and reduces collisions, underscoring the necessity of human-centric approaches. By releasing datasets, simulators, baselines, and protocols, HA-VLN 2.0 provides a strong foundation for safe, socially responsible navigation research.
format Preprint
id arxiv_https___arxiv_org_abs_2503_14229
institution arXiv
publishDate 2025
record_format arxiv
title HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions
topic Artificial Intelligence
Computer Vision and Pattern Recognition
Robotics
url https://arxiv.org/abs/2503.14229