Staff View :: Library Catalog

Bibliographic Details
Main Authors:	DeMarco, Andrea; Conti, Ian Fenech; Camilleri, Hayley; Bushi, Ardiana; Riggi, Simone
Format:	Preprint
Published:	2026
Subjects:	Instrumentation and Methods for Astrophysics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.29660

_version_	1866917391092416512
author	DeMarco, Andrea; Conti, Ian Fenech; Camilleri, Hayley; Bushi, Ardiana; Riggi, Simone
author_facet	DeMarco, Andrea; Conti, Ian Fenech; Camilleri, Hayley; Bushi, Ardiana; Riggi, Simone
contents	Next-generation radio astronomy surveys are delivering millions of resolved sources, but robust and scalable morphology analysis remains difficult across heterogeneous telescopes and imaging pipelines. We present STRADAViT, a self-supervised Vision Transformer continued-pretraining framework for learning transferable encoders from radio astronomy imagery. The framework combines mixed-survey data curation, radio astronomy-aware training-view generation, and a ViT-MAE-initialized encoder family with optional register tokens. It supports reconstruction-only, contrastive-only, and two-stage branches. Our pretraining dataset comprises radio astronomy cutouts drawn from four complementary sources. We evaluate transfer with linear probing and fine-tuning on three morphology benchmarks spanning binary and multi-class settings. Relative to the ViT-MAE initialization used for continued pretraining, the best two-stage models improve Macro-F1 in all reported linear-probe settings and in two of three fine-tuning settings, with the largest gain on RGZ DR1. Relative to DINOv2, gains are selective rather than universal: the best two-stage models achieve higher mean Macro-F1 than the strongest DINOv2 baseline on LoTSS DR2 and RGZ DR1 under linear probing, and on MiraBest and RGZ DR1 under fine-tuning. A targeted DINOv2 initialization ablation further indicates that the adaptation recipe is not specific to the ViT-MAE starting point and that, under the same recipe. The ViT-MAE-based STRADAViT checkpoint is retained as the released checkpoint because it combines competitive transfer with substantially lower token count and downstream cost than the DINOv2-based alternative. These results indicate that radio astronomy-aware view generation and staged continued pretraining can provide a stronger domain-adapted starting point than off-the-shelf ViT checkpoints for radio astronomy transfer.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_29660
institution	arXiv
publishDate	2026
record_format	arxiv
title	STRADAViT: Towards a Foundational Model for Radio Astronomy through Self-Supervised Transfer
topic	Instrumentation and Methods for Astrophysics; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2603.29660