Affichage MARC: :: Library Catalog

Enregistré dans:

Détails bibliographiques
Auteurs principaux:	Fan, Rizhao, Ma, Tianfang, Li, Zhigen, An, Ning, Cheng, Jian
Format:	Preprint
Publié:	2025
Sujets:	Computer Vision and Pattern Recognition
Accès en ligne:	https://arxiv.org/abs/2507.14879
Tags:	Ajouter un tag Pas de tags, Soyez le premier à ajouter un tag!

_version_	1866912492980011008
author	Fan, Rizhao Ma, Tianfang Li, Zhigen An, Ning Cheng, Jian
author_facet	Fan, Rizhao Ma, Tianfang Li, Zhigen An, Ning Cheng, Jian
contents	In recent years, the emergence of foundation models for depth prediction has led to remarkable progress, particularly in zero-shot monocular depth estimation. These models generate impressive depth predictions; however, their outputs are often in relative scale rather than metric scale. This limitation poses challenges for direct deployment in real-world applications. To address this, several scale adaptation methods have been proposed to enable foundation models to produce metric depth. However, these methods are typically costly, as they require additional training on new domains and datasets. Moreover, fine-tuning these models often compromises their original generalization capabilities, limiting their adaptability across diverse scenes. In this paper, we introduce a non-learning-based approach that leverages sparse depth measurements to adapt the relative-scale predictions of foundation models into metric-scale depth. Our method requires neither retraining nor fine-tuning, thereby preserving the strong generalization ability of the original foundation models while enabling them to produce metric depth. Experimental results demonstrate the effectiveness of our approach, high-lighting its potential to bridge the gap between relative and metric depth without incurring additional computational costs or sacrificing generalization ability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_14879
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Region-aware Depth Scale Adaptation with Sparse Measurements Fan, Rizhao Ma, Tianfang Li, Zhigen An, Ning Cheng, Jian Computer Vision and Pattern Recognition In recent years, the emergence of foundation models for depth prediction has led to remarkable progress, particularly in zero-shot monocular depth estimation. These models generate impressive depth predictions; however, their outputs are often in relative scale rather than metric scale. This limitation poses challenges for direct deployment in real-world applications. To address this, several scale adaptation methods have been proposed to enable foundation models to produce metric depth. However, these methods are typically costly, as they require additional training on new domains and datasets. Moreover, fine-tuning these models often compromises their original generalization capabilities, limiting their adaptability across diverse scenes. In this paper, we introduce a non-learning-based approach that leverages sparse depth measurements to adapt the relative-scale predictions of foundation models into metric-scale depth. Our method requires neither retraining nor fine-tuning, thereby preserving the strong generalization ability of the original foundation models while enabling them to produce metric depth. Experimental results demonstrate the effectiveness of our approach, high-lighting its potential to bridge the gap between relative and metric depth without incurring additional computational costs or sacrificing generalization ability.
title	Region-aware Depth Scale Adaptation with Sparse Measurements
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2507.14879

Documents similaires