Staff View: PAN: Pillars-Attention-Based Network for 3D Object Detection :: Library Catalog

Bibliographic Details
Main Authors:	Bispo, Ruan; Mitrev, Dane; Mariotti, Letizia; Botty, Clément; Humphrey, Denver; Scanlan, Anthony; Eising, Ciarán
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.15935

_version_	1866915527229702144
author	Bispo, Ruan; Mitrev, Dane; Mariotti, Letizia; Botty, Clément; Humphrey, Denver; Scanlan, Anthony; Eising, Ciarán
author_facet	Bispo, Ruan; Mitrev, Dane; Mariotti, Letizia; Botty, Clément; Humphrey, Denver; Scanlan, Anthony; Eising, Ciarán
contents	Camera-radar fusion offers a robust, low-cost alternative to camera-lidar fusion for real-time 3D object detection under adverse weather and lighting conditions. However, few works in the literature focus on this modality and, more importantly, few develop new architectures that exploit the advantages of the radar point cloud, such as accurate distance estimation and speed information. This work therefore presents a novel and efficient 3D object detection algorithm using cameras and radars in the bird's-eye view (BEV). Our algorithm exploits the advantages of radar before fusing the features into a detection head. A new backbone is introduced that maps the radar pillar features into an embedded dimension, and a self-attention mechanism allows the backbone to model the dependencies between the radar points. A simplified convolutional layer replaces the FPN-based convolutional layers used in PointPillars-based architectures, with the main goal of reducing inference time. Our results show that, with this modification, our approach achieves a new state of the art in 3D object detection, reaching an NDS of 58.2 when using ResNet-50, while also setting a new benchmark for inference time on the nuScenes dataset in the same category.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_15935
institution	arXiv
publishDate	2025
record_format	arxiv
title	PAN: Pillars-Attention-Based Network for 3D Object Detection
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2509.15935
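Illustrative note on the abstract: the record describes a backbone that maps radar pillar features into an embedded dimension and applies self-attention across them. The following is a minimal sketch of that general idea only, not the authors' implementation. It uses single-head scaled dot-product attention in pure Python; all sizes, weights, and names (`pillar_features`, `w_embed`, `self_attention`) are hypothetical, and random numbers stand in for real radar data.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: (num_pillars x d) embeddings; w_q, w_k, w_v: (d x d) projections.
    Returns (attended_features, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Small random matrix used as a stand-in for learned weights or data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_pillars, raw_dim, embed_dim = 5, 4, 8  # hypothetical sizes

# Stand-in for per-pillar radar features (assumption: one vector per pillar).
pillar_features = random_matrix(num_pillars, raw_dim, rng)

# Linear map into the embedded dimension (assumption: a plain projection).
w_embed = random_matrix(raw_dim, embed_dim, rng)
embedded = matmul(pillar_features, w_embed)

# Self-attention lets every pillar embedding attend to every other one.
w_q, w_k, w_v = (random_matrix(embed_dim, embed_dim, rng) for _ in range(3))
attended, attn = self_attention(embedded, w_q, w_k, w_v)
```

Each row of `attn` is a probability distribution over all pillars, which is one way dependencies between radar points could be modeled; the preprint itself should be consulted for the actual backbone design.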

Similar Items