Saved in:
| Main Authors: | , , , , , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.23599 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866915820744998912 |
|---|---|
| author | Zhou, Yuchen Luo, Yan Wang, Xiaogang Gu, Xingjian Lu, Mingzhou Shu, Xiangbo |
| author_facet | Zhou, Yuchen Luo, Yan Wang, Xiaogang Gu, Xingjian Lu, Mingzhou Shu, Xiangbo |
| contents | Efficient and high-accuracy 3D occupancy prediction is vital for the performance of autonomous driving systems. However, existing methods struggle to balance precision and efficiency: high-accuracy approaches are often hindered by heavy computational overhead, leading to slow inference speeds, while others leverage pure bird's-eye-view (BEV) representations to gain speed at the cost of losing vertical spatial cues and compromising geometric integrity. To overcome these limitations, we build on the efficient Lift-Splat-Shoot (LSS) paradigm and propose a pure 2D framework, DA-Occ, for 3D occupancy prediction that preserves fine-grained geometry. Standard LSS-based methods lift 2D features into 3D space solely based on depth scores, making it difficult to fully capture vertical structure. To improve upon this, DA-Occ augments depth-based lifting with a complementary height-score projection that explicitly encodes vertical geometric information. We further employ direction-aware convolution to extract geometric features along both vertical and horizontal orientations, effectively balancing accuracy and computational efficiency. On the Occ3D-nuScenes, the proposed method achieves an mIoU of 39.3% and an inference speed of 27.7 FPS, effectively balancing accuracy and efficiency. In simulations on edge devices, the inference speed reaches 14.8 FPS, further demonstrating the method's applicability for real-time deployment in resource-constrained environments. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2507_23599 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | DA-Occ: Direction-Aware 2D Convolution for Efficient and Geometry-Preserving 3D Occupancy Prediction in Autonomous Driving Zhou, Yuchen Luo, Yan Wang, Xiaogang Gu, Xingjian Lu, Mingzhou Shu, Xiangbo Computer Vision and Pattern Recognition Efficient and high-accuracy 3D occupancy prediction is vital for the performance of autonomous driving systems. However, existing methods struggle to balance precision and efficiency: high-accuracy approaches are often hindered by heavy computational overhead, leading to slow inference speeds, while others leverage pure bird's-eye-view (BEV) representations to gain speed at the cost of losing vertical spatial cues and compromising geometric integrity. To overcome these limitations, we build on the efficient Lift-Splat-Shoot (LSS) paradigm and propose a pure 2D framework, DA-Occ, for 3D occupancy prediction that preserves fine-grained geometry. Standard LSS-based methods lift 2D features into 3D space solely based on depth scores, making it difficult to fully capture vertical structure. To improve upon this, DA-Occ augments depth-based lifting with a complementary height-score projection that explicitly encodes vertical geometric information. We further employ direction-aware convolution to extract geometric features along both vertical and horizontal orientations, effectively balancing accuracy and computational efficiency. On the Occ3D-nuScenes, the proposed method achieves an mIoU of 39.3% and an inference speed of 27.7 FPS, effectively balancing accuracy and efficiency. In simulations on edge devices, the inference speed reaches 14.8 FPS, further demonstrating the method's applicability for real-time deployment in resource-constrained environments. |
| title | DA-Occ: Direction-Aware 2D Convolution for Efficient and Geometry-Preserving 3D Occupancy Prediction in Autonomous Driving |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2507.23599 |