Bibliographic Details
Main Authors: Zhu, Shengjie; Abdelkader, Ahmed; Matthews, Mark J.; Liu, Xiaoming; Chu, Wen-Sheng
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2602.18906
_version_ 1866918349648166912
author Zhu, Shengjie
Abdelkader, Ahmed
Matthews, Mark J.
Liu, Xiaoming
Chu, Wen-Sheng
author_facet Zhu, Shengjie
Abdelkader, Ahmed
Matthews, Mark J.
Liu, Xiaoming
Chu, Wen-Sheng
contents Structure-from-Motion (SfM) is a fundamental 3D vision task for recovering camera parameters and scene geometry from multi-view images. While recent deep learning advances enable accurate Monocular Depth Estimation (MDE) from single images without depending on camera motion, integrating MDE into SfM remains a challenge. Unlike conventional triangulated sparse point clouds, MDE produces dense depth maps with significantly higher error variance. Inspired by modern RANSAC estimators, we propose Marginalized Bundle Adjustment (MBA), which mitigates MDE error variance by leveraging the density of the depth maps. With MBA, we show that MDE depth maps are sufficiently accurate to yield state-of-the-art (SoTA) or competitive results in SfM and camera relocalization tasks. Through extensive evaluations, we demonstrate consistently robust performance across varying scales, ranging from few-frame setups to large multi-view systems with thousands of images. Our method highlights the significant potential of MDE in multi-view 3D vision.
format Preprint
id arxiv_https___arxiv_org_abs_2602_18906
institution arXiv
publishDate 2026
record_format arxiv
title Marginalized Bundle Adjustment: Multi-View Camera Pose from Monocular Depth Estimates
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2602.18906