Bibliographic details
Main authors: Fan, Rizhao; Ma, Tianfang; Li, Zhigen; An, Ning; Cheng, Jian
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online access: https://arxiv.org/abs/2507.14879
_version_ 1866912492980011008
author Fan, Rizhao
Ma, Tianfang
Li, Zhigen
An, Ning
Cheng, Jian
author_facet Fan, Rizhao
Ma, Tianfang
Li, Zhigen
An, Ning
Cheng, Jian
contents In recent years, the emergence of foundation models for depth prediction has led to remarkable progress, particularly in zero-shot monocular depth estimation. These models generate impressive depth predictions; however, their outputs are often in relative scale rather than metric scale. This limitation poses challenges for direct deployment in real-world applications. To address this, several scale adaptation methods have been proposed to enable foundation models to produce metric depth. However, these methods are typically costly, as they require additional training on new domains and datasets. Moreover, fine-tuning these models often compromises their original generalization capabilities, limiting their adaptability across diverse scenes. In this paper, we introduce a non-learning-based approach that leverages sparse depth measurements to adapt the relative-scale predictions of foundation models into metric-scale depth. Our method requires neither retraining nor fine-tuning, thereby preserving the strong generalization ability of the original foundation models while enabling them to produce metric depth. Experimental results demonstrate the effectiveness of our approach, highlighting its potential to bridge the gap between relative and metric depth without incurring additional computational costs or sacrificing generalization ability.
format Preprint
id arxiv_https___arxiv_org_abs_2507_14879
institution arXiv
publishDate 2025
record_format arxiv
title Region-aware Depth Scale Adaptation with Sparse Measurements
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2507.14879