Bibliographic Details
Main Authors: Wei, Xiaobao, Shu, Changyong, Yue, Zhaokun, Huang, Chang, Liu, Weiwei, Yang, Shuai, Yang, Lirong, Gao, Peng, Zhang, Wenbin, Zhu, Gaochao, Wang, Chengxiang
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2509.02415
_version_ 1866912566965436416
author Wei, Xiaobao
Shu, Changyong
Yue, Zhaokun
Huang, Chang
Liu, Weiwei
Yang, Shuai
Yang, Lirong
Gao, Peng
Zhang, Wenbin
Zhu, Gaochao
Wang, Chengxiang
contents High-performance real-time stereo matching methods invariably rely on 3D regularization of the cost volume, which is unfriendly to mobile devices, while methods based on 2D regularization struggle in ill-posed regions. In this paper, we present DBStereo, a deployment-friendly 4D cost aggregation network based on pure 2D convolutions. Specifically, we first provide a thorough analysis of the decoupling characteristics of the 4D cost volume, and then design a lightweight bidirectional geometry aggregation block to capture spatial and disparity representations respectively. Through decoupled learning, our approach achieves real-time performance and impressive accuracy simultaneously. Extensive experiments demonstrate that DBStereo outperforms all existing aggregation-based methods in both inference time and accuracy, even surpassing the iterative method IGEV-Stereo. Our study breaks with the empirical design of using 3D convolutions for the 4D cost volume and provides a simple yet strong baseline for the proposed decoupled aggregation paradigm for further study. Code will be available at https://github.com/happydummy/DBStereo soon.
format Preprint
id arxiv_https___arxiv_org_abs_2509_02415
institution arXiv
publishDate 2025
record_format arxiv
title Decoupling Bidirectional Geometric Representations of 4D cost volume with 2D convolution
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2509.02415
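
The abstract describes aggregating a 4D cost volume with 2D convolutions only, handling the spatial and disparity directions separately. The sketch below is one possible reading of that idea, not the authors' block. It assumes a cost volume laid out as (C, D, H, W), a spatial pass that filters each (H, W) plane, and a disparity pass that filters each (D, W) plane. The names conv2d_same and decoupled_aggregate, the kernel shapes, and the axis layout are all hypothetical.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive zero-padded 'same' 2D cross-correlation.

    x: batch of planes with shape (N, A, B); kernel: (ka, kb) with odd sizes.
    """
    ka, kb = kernel.shape
    pa, pb = ka // 2, kb // 2
    padded = np.pad(x, ((0, 0), (pa, pa), (pb, pb)))
    out = np.zeros_like(x, dtype=float)
    for i in range(ka):
        for j in range(kb):
            out += kernel[i, j] * padded[:, i:i + x.shape[1], j:j + x.shape[2]]
    return out

def decoupled_aggregate(cost, k_spatial, k_disp):
    """Two 2D passes over a (C, D, H, W) volume in place of one 3D convolution.

    Assumption: this axis layout and pass order are illustrative only.
    """
    C, D, H, W = cost.shape
    # Spatial pass: fold C and D into the batch axis, filter over (H, W).
    spatial = conv2d_same(cost.reshape(C * D, H, W), k_spatial)
    spatial = spatial.reshape(C, D, H, W)
    # Disparity pass: bring H next to C, fold both into the batch axis,
    # and filter over (D, W).
    by_row = spatial.transpose(0, 2, 1, 3).reshape(C * H, D, W)
    disp = conv2d_same(by_row, k_disp).reshape(C, H, D, W)
    # Restore the original (C, D, H, W) layout.
    return disp.transpose(0, 2, 1, 3)

rng = np.random.default_rng(0)
cost = rng.standard_normal((2, 4, 5, 6))   # toy volume: C=2, D=4, H=5, W=6
k_spatial = np.full((3, 3), 1.0 / 9.0)     # 3x3 box filter over (H, W)
k_disp = np.full((3, 1), 1.0 / 3.0)        # 3-tap filter along D only
out = decoupled_aggregate(cost, k_spatial, k_disp)
print(out.shape)
```

Because each pass only reshapes and transposes the volume before an ordinary 2D filter, the same pattern maps onto standard 2D convolution layers in a deep learning framework, where the fixed box filters used here would be replaced by learned weights.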