_version_ 1866910992391208960
author Chen, Sheng
He, Peiyu
Hu, Jiaxin
Liu, Ziyang
Wang, Yansheng
Xu, Tao
Zhang, Chi
Zhang, Chongchong
An, Chao
Cai, Shiyu
Cao, Duo
Chen, Kangping
Chu, Shuai
Chu, Tianwei
Dan, Mingdi
Du, Min
Fang, Weiwei
Fu, Pengyou
Hu, Junkai
Jiang, Xiaowei
Jiang, Zhaodi
Li, Fuxuan
Li, Jun
Li, Minghui
Li, Mingyao
Li, Yanchang
Li, Zhibin
Liu, Guangming
Liu, Kairui
Liu, Lihao
Liu, Weizhi
Liu, Xiaoshun
Liu, Yufei
Liu, Yunfei
Lu, Qiang
Luo, Yuanfei
Lv, Xiang
Ma, Hongying
Ma, Sai
Mi, Lingxian
Sa, Sha
Shu, Hongxiang
Tian, Lei
Wang, Chengzhi
Wang, Jiayu
Wang, Kaijie
Wang, Qingyi
Wang, Renwen
Wang, Tao
Wang, Wei
Wang, Xirui
Wei, Chao
Wei, Xuguang
Xia, Zijun
Xiao, Zhaohao
Yan, Tingshuai
Yang, Liyan
Yang, Yifan
Yang, Zhikai
Yin, Zhong
Yuan, Li
Yuan, Liuchun
Zhang, Chi
Zhang, Jinyang
Zhang, Junhui
Zhang, Linge
Zhang, Zhenyi
Zhang, Zheyu
Zhu, Dongjie
Li, Hang
Zhang, Yangang
contents Modern robot navigation systems encounter difficulties in diverse and complex indoor environments. Traditional approaches rely on multiple modules with small models or rule-based systems and thus lack adaptability to new environments. To address this, we developed Astra, a comprehensive dual-model architecture consisting of Astra-Global and Astra-Local, for mobile robot navigation. Astra-Global, a multimodal LLM, processes vision and language inputs to perform self-localization and goal localization using a hybrid topological-semantic graph as the global map, and outperforms traditional visual place recognition methods. Astra-Local, a multitask network, handles local path planning and odometry estimation. Its 4D spatial-temporal encoder, trained through self-supervised learning, generates robust 4D features for downstream tasks. The planning head uses flow matching and a novel masked ESDF loss to generate local trajectories that minimize collision risk, and the odometry head integrates multi-sensor inputs via a transformer encoder to predict the relative pose of the robot. Deployed on real in-house mobile robots, Astra achieves a high end-to-end mission success rate across diverse indoor environments.
format Preprint
id arxiv_https___arxiv_org_abs_2506_06205
institution arXiv
publishDate 2025
record_format arxiv
title Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning
topic Robotics
Artificial Intelligence
url https://arxiv.org/abs/2506.06205