Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Vanyan, Ani; Barseghyan, Alvard; Tamazyan, Hakob; Galstyan, Tigran; Huroyan, Vahan; Hovakimyan, Naira; Khachatrian, Hrant
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.17014

_version_	1866911220433420288
author	Vanyan, Ani; Barseghyan, Alvard; Tamazyan, Hakob; Galstyan, Tigran; Huroyan, Vahan; Hovakimyan, Naira; Khachatrian, Hrant
author_facet	Vanyan, Ani; Barseghyan, Alvard; Tamazyan, Hakob; Galstyan, Tigran; Huroyan, Vahan; Hovakimyan, Naira; Khachatrian, Hrant
contents	Foundation models have advanced machine learning across various modalities, including images. Recently, multiple teams have trained foundation models specialized for remote sensing applications. This line of research is motivated by the distinct characteristics of remote sensing imagery, its specific applications, and the types of robustness useful for satellite image analysis. In this work we systematically challenge the idea that specialized foundation models are more useful than general-purpose vision foundation models, at least at small scale. First, we design a simple benchmark that measures how well remote sensing models generalize to lower-resolution images on two downstream tasks. Second, we train iBOT, a self-supervised vision encoder, on MillionAID, an ImageNet-scale satellite imagery dataset, with several modifications specific to remote sensing. We show that none of these pretrained models brings consistent improvements over general-purpose baselines at the ViT-B scale.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_17014
institution	arXiv
publishDate	2025
record_format	arxiv
title	Do Satellite Tasks Need Special Pretraining?
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2510.17014
