Bibliographic Details
Main Authors: Jiang, Jiahao, Yang, Yuxiang, Deng, Yingqi, Ma, Chenlong, Zhang, Jing
Format: Preprint
Published: 2024
Subjects: Robotics
Online Access: https://arxiv.org/abs/2409.01646
_version_ 1866916379855159296
author Jiang, Jiahao
Yang, Yuxiang
Deng, Yingqi
Ma, Chenlong
Zhang, Jing
contents Goal-driven mobile robot navigation in map-less environments requires effective state representations for reliable decision-making. Inspired by the favorable properties of Bird's-Eye View (BEV) in point clouds for visual perception, this paper introduces a novel navigation approach named BEVNav. It employs deep reinforcement learning to learn BEV representations and enhance decision-making reliability. First, we propose a self-supervised spatial-temporal contrastive learning approach to learn BEV representations. Spatially, two randomly augmented views from a point cloud predict each other, enhancing spatial features. Temporally, we combine the current observation with consecutive frames' actions to predict future features, establishing the relationship between observation transitions and actions to capture temporal cues. Then, incorporating this spatial-temporal contrastive learning in the Soft Actor-Critic reinforcement learning framework, our BEVNav offers a superior navigation policy. Extensive experiments demonstrate BEVNav's robustness in environments with dense pedestrians, outperforming state-of-the-art methods across multiple benchmarks. The code will be made publicly available at https://github.com/LanrenzzzZ/BEVNav.
format Preprint
id arxiv_https___arxiv_org_abs_2409_01646
institution arXiv
publishDate 2024
record_format arxiv
title BEVNav: Robot Autonomous Navigation Via Spatial-Temporal Contrastive Learning in Bird's-Eye View
topic Robotics
url https://arxiv.org/abs/2409.01646