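| Main Authors: | Wei, Xiaobao; Shu, Changyong; Yue, Zhaokun; Huang, Chang; Liu, Weiwei; Yang, Shuai; Yang, Lirong; Gao, Peng; Zhang, Wenbin; Zhu, Gaochao; Wang, Chengxiang |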
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2509.02415 |
| _version_ | 1866912566965436416 |
|---|---|
| author | Wei, Xiaobao; Shu, Changyong; Yue, Zhaokun; Huang, Chang; Liu, Weiwei; Yang, Shuai; Yang, Lirong; Gao, Peng; Zhang, Wenbin; Zhu, Gaochao; Wang, Chengxiang |
| author_facet | Wei, Xiaobao; Shu, Changyong; Yue, Zhaokun; Huang, Chang; Liu, Weiwei; Yang, Shuai; Yang, Lirong; Gao, Peng; Zhang, Wenbin; Zhu, Gaochao; Wang, Chengxiang |
| contents | High-performance real-time stereo matching methods invariably rely on 3D regularization of the cost volume, which is unfriendly to mobile devices, while methods based on 2D regularization struggle in ill-posed regions. In this paper, we present DBStereo, a deployment-friendly 4D cost aggregation network based purely on 2D convolutions. Specifically, we first provide a thorough analysis of the decoupling characteristics of the 4D cost volume, and then design a lightweight bidirectional geometry aggregation block to capture spatial and disparity representations separately. Through decoupled learning, our approach achieves real-time performance and impressive accuracy simultaneously. Extensive experiments demonstrate that DBStereo outperforms all existing aggregation-based methods in both inference time and accuracy, even surpassing the iterative method IGEV-Stereo. Our study breaks with the empirical design of using 3D convolutions for the 4D cost volume and provides a simple yet strong baseline for the proposed decoupled aggregation paradigm for further study. Code will be available at https://github.com/happydummy/DBStereo soon. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2509_02415 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Decoupling Bidirectional Geometric Representations of 4D cost volume with 2D convolution |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2509.02415 |