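| Main Authors: | Dong, Yifei; Wu, Fengyi; He, Qi; Cheng, Zhi-Qi; Li, Heng; Li, Minghan; Cheng, Zebang; Zhou, Yuxuan; Sun, Jingdong; Dai, Qi; Hauptmann, Alexander G |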
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Artificial Intelligence; Computer Vision and Pattern Recognition; Robotics |
| Online Access: | https://arxiv.org/abs/2503.14229 |
| _version_ | 1866908585372418048 |
|---|---|
| author | Dong, Yifei; Wu, Fengyi; He, Qi; Cheng, Zhi-Qi; Li, Heng; Li, Minghan; Cheng, Zebang; Zhou, Yuxuan; Sun, Jingdong; Dai, Qi; Hauptmann, Alexander G |
| author_facet | Dong, Yifei; Wu, Fengyi; He, Qi; Cheng, Zhi-Qi; Li, Heng; Li, Minghan; Cheng, Zebang; Zhou, Yuxuan; Sun, Jingdong; Dai, Qi; Hauptmann, Alexander G |
| contents | Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous settings, with little attention to dynamic, crowded environments. We present HA-VLN 2.0, a unified benchmark introducing explicit social-awareness constraints. Our contributions are: (i) a standardized task and metrics capturing both goal accuracy and personal-space adherence; (ii) HAPS 2.0 dataset and simulators modeling multi-human interactions, outdoor contexts, and finer language-motion alignment; (iii) benchmarks on 16,844 socially grounded instructions, revealing sharp performance drops of leading agents under human dynamics and partial observability; and (iv) real-world robot experiments validating sim-to-real transfer, with an open leaderboard enabling transparent comparison. Results show that explicit social modeling improves navigation robustness and reduces collisions, underscoring the necessity of human-centric approaches. By releasing datasets, simulators, baselines, and protocols, HA-VLN 2.0 provides a strong foundation for safe, socially responsible navigation research. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2503_14229 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions |
| topic | Artificial Intelligence; Computer Vision and Pattern Recognition; Robotics |
| url | https://arxiv.org/abs/2503.14229 |