Saved in:
Bibliographic Details
Main Authors: Zhou, Yuchen, Luo, Yan, Wang, Xiaogang, Gu, Xingjian, Lu, Mingzhou, Shu, Xiangbo
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2507.23599
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866915820744998912
author Zhou, Yuchen
Luo, Yan
Wang, Xiaogang
Gu, Xingjian
Lu, Mingzhou
Shu, Xiangbo
author_facet Zhou, Yuchen
Luo, Yan
Wang, Xiaogang
Gu, Xingjian
Lu, Mingzhou
Shu, Xiangbo
contents Efficient and high-accuracy 3D occupancy prediction is vital for the performance of autonomous driving systems. However, existing methods struggle to balance precision and efficiency: high-accuracy approaches are often hindered by heavy computational overhead, leading to slow inference speeds, while others leverage pure bird's-eye-view (BEV) representations to gain speed at the cost of losing vertical spatial cues and compromising geometric integrity. To overcome these limitations, we build on the efficient Lift-Splat-Shoot (LSS) paradigm and propose a pure 2D framework, DA-Occ, for 3D occupancy prediction that preserves fine-grained geometry. Standard LSS-based methods lift 2D features into 3D space solely based on depth scores, making it difficult to fully capture vertical structure. To improve upon this, DA-Occ augments depth-based lifting with a complementary height-score projection that explicitly encodes vertical geometric information. We further employ direction-aware convolution to extract geometric features along both vertical and horizontal orientations, effectively balancing accuracy and computational efficiency. On the Occ3D-nuScenes, the proposed method achieves an mIoU of 39.3% and an inference speed of 27.7 FPS, effectively balancing accuracy and efficiency. In simulations on edge devices, the inference speed reaches 14.8 FPS, further demonstrating the method's applicability for real-time deployment in resource-constrained environments.
format Preprint
id arxiv_https___arxiv_org_abs_2507_23599
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle DA-Occ: Direction-Aware 2D Convolution for Efficient and Geometry-Preserving 3D Occupancy Prediction in Autonomous Driving
Zhou, Yuchen
Luo, Yan
Wang, Xiaogang
Gu, Xingjian
Lu, Mingzhou
Shu, Xiangbo
Computer Vision and Pattern Recognition
Efficient and high-accuracy 3D occupancy prediction is vital for the performance of autonomous driving systems. However, existing methods struggle to balance precision and efficiency: high-accuracy approaches are often hindered by heavy computational overhead, leading to slow inference speeds, while others leverage pure bird's-eye-view (BEV) representations to gain speed at the cost of losing vertical spatial cues and compromising geometric integrity. To overcome these limitations, we build on the efficient Lift-Splat-Shoot (LSS) paradigm and propose a pure 2D framework, DA-Occ, for 3D occupancy prediction that preserves fine-grained geometry. Standard LSS-based methods lift 2D features into 3D space solely based on depth scores, making it difficult to fully capture vertical structure. To improve upon this, DA-Occ augments depth-based lifting with a complementary height-score projection that explicitly encodes vertical geometric information. We further employ direction-aware convolution to extract geometric features along both vertical and horizontal orientations, effectively balancing accuracy and computational efficiency. On the Occ3D-nuScenes, the proposed method achieves an mIoU of 39.3% and an inference speed of 27.7 FPS, effectively balancing accuracy and efficiency. In simulations on edge devices, the inference speed reaches 14.8 FPS, further demonstrating the method's applicability for real-time deployment in resource-constrained environments.
title DA-Occ: Direction-Aware 2D Convolution for Efficient and Geometry-Preserving 3D Occupancy Prediction in Autonomous Driving
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2507.23599