Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Author:	Bochare, Anupkumar
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.06113
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909606050004992
author	Bochare, Anupkumar
author_facet	Bochare, Anupkumar
contents	Autonomous vehicle perception systems have traditionally relied on costly LiDAR sensors to generate precise environmental representations. In this paper, we propose a camera-only perception framework that produces Bird's Eye View (BEV) maps by extending the Lift-Splat-Shoot architecture. Our method combines YOLOv11-based object detection with DepthAnythingV2 monocular depth estimation across multi-camera inputs to achieve comprehensive 360-degree scene understanding. We evaluate our approach on the OpenLane-V2 and NuScenes datasets, achieving up to 85% road segmentation accuracy and 85-90% vehicle detection rates when compared against LiDAR ground truth, with average positional errors limited to 1.2 meters. These results highlight the potential of deep learning to extract rich spatial information using only camera inputs, enabling cost-efficient autonomous navigation without sacrificing accuracy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_06113
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Camera-Only Bird's Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles Bochare, Anupkumar Computer Vision and Pattern Recognition Autonomous vehicle perception systems have traditionally relied on costly LiDAR sensors to generate precise environmental representations. In this paper, we propose a camera-only perception framework that produces Bird's Eye View (BEV) maps by extending the Lift-Splat-Shoot architecture. Our method combines YOLOv11-based object detection with DepthAnythingV2 monocular depth estimation across multi-camera inputs to achieve comprehensive 360-degree scene understanding. We evaluate our approach on the OpenLane-V2 and NuScenes datasets, achieving up to 85% road segmentation accuracy and 85-90% vehicle detection rates when compared against LiDAR ground truth, with average positional errors limited to 1.2 meters. These results highlight the potential of deep learning to extract rich spatial information using only camera inputs, enabling cost-efficient autonomous navigation without sacrificing accuracy.
title	Camera-Only Bird's Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2505.06113

Similar Items