Staff View: SGV3D: Towards Scenario Generalization for Vision-based Roadside 3D Object Detection

Bibliographic Details
Main Authors:	Yang, Lei; Zhang, Xinyu; Li, Jun; Wang, Li; Zhang, Chuang; Ju, Li; Li, Zhiwei; Shen, Yang
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2401.16110

_version_	1866916198191464448
author	Yang, Lei; Zhang, Xinyu; Li, Jun; Wang, Li; Zhang, Chuang; Ju, Li; Li, Zhiwei; Shen, Yang
author_facet	Yang, Lei; Zhang, Xinyu; Li, Jun; Wang, Li; Zhang, Chuang; Ju, Li; Li, Zhiwei; Shen, Yang
contents	Roadside perception can greatly increase the safety of autonomous vehicles by extending their perception beyond the visual range and covering blind spots. However, current state-of-the-art vision-based roadside detection methods achieve high accuracy on labeled scenes but perform poorly on new scenes. This is because roadside cameras remain stationary after installation and can only collect data from a single scene, so the algorithm overfits to these roadside backgrounds and camera poses. To address this issue, we propose a Scenario Generalization Framework for Vision-based Roadside 3D Object Detection, dubbed SGV3D. Specifically, we employ a Background-suppressed Module (BSM) to mitigate background overfitting in vision-centric pipelines by attenuating background features during the 2D to bird's-eye-view projection. Furthermore, the Semi-supervised Data Generation Pipeline (SSDG) uses unlabeled images from new scenes to generate diverse instance foregrounds with varying camera poses, addressing the risk of overfitting to specific camera poses. We evaluate our method on two large-scale roadside benchmarks. Our method surpasses all previous methods by a significant margin in new scenes, including +42.57% for vehicle, +5.87% for pedestrian, and +14.89% for cyclist compared to BEVHeight on the DAIR-V2X-I heterologous benchmark. On the larger-scale Rope3D heterologous benchmark, we achieve notable gains of +14.48% for car and +12.41% for large vehicle. We hope to contribute insights on roadside perception techniques, emphasizing their capability for scenario generalization. The code will be available at https://github.com/yanglei18/SGV3D
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_16110
institution	arXiv
publishDate	2024
record_format	arxiv
title	SGV3D: Towards Scenario Generalization for Vision-based Roadside 3D Object Detection
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2401.16110