| Main Authors: | Zhu, Shengjie; Abdelkader, Ahmed; Matthews, Mark J.; Liu, Xiaoming; Chu, Wen-Sheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2602.18906 |
| _version_ | 1866918349648166912 |
|---|---|
| author | Zhu, Shengjie; Abdelkader, Ahmed; Matthews, Mark J.; Liu, Xiaoming; Chu, Wen-Sheng |
| author_facet | Zhu, Shengjie; Abdelkader, Ahmed; Matthews, Mark J.; Liu, Xiaoming; Chu, Wen-Sheng |
| contents | Structure-from-Motion (SfM) is a fundamental 3D vision task for recovering camera parameters and scene geometry from multi-view images. While recent deep learning advances enable accurate Monocular Depth Estimation (MDE) from single images without depending on camera motion, integrating MDE into SfM remains a challenge. Unlike conventional triangulated sparse point clouds, MDE produces dense depth maps with significantly higher error variance. Inspired by modern RANSAC estimators, we propose Marginalized Bundle Adjustment (MBA) to mitigate MDE error variance leveraging its density. With MBA, we show that MDE depth maps are sufficiently accurate to yield SoTA or competitive results in SfM and camera relocalization tasks. Through extensive evaluations, we demonstrate consistently robust performance across varying scales, ranging from few-frame setups to large multi-view systems with thousands of images. Our method highlights the significant potential of MDE in multi-view 3D vision. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_18906 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Marginalized Bundle Adjustment: Multi-View Camera Pose from Monocular Depth Estimates |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2602.18906 |