Staff View: Learngene Search Across Multiple Datasets for Building Variable-Sized Models :: Library Catalog

Bibliographic Details
Main Authors:	Shi, Boyu; Zhou, Junbo; Liu, Chang; Yang, Xu; Wang, Qiufeng; Geng, Xin
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.08209

_version_	1866914545775149056
author	Shi, Boyu; Zhou, Junbo; Liu, Chang; Yang, Xu; Wang, Qiufeng; Geng, Xin
author_facet	Shi, Boyu; Zhou, Junbo; Liu, Chang; Yang, Xu; Wang, Qiufeng; Geng, Xin
contents	Deep learning methods are widely used under diverse resource constraints, resulting in models of varying sizes, such as the Vision Transformer (ViT) series. Deploying these models typically requires costly pretraining and finetuning. The Learngene paradigm addresses this issue by extracting transferable components, called learngenes, from a pretrained ancestry model (Ans-Net) to initialize variable-sized descendant models (Des-Nets). Existing learngene extraction methods rely on a single dataset, limiting downstream performance. To address this limitation, we propose Learngene Search Across Multiple Datasets for Building Variable-Sized Models (LSAMD). LSAMD expands the Ans-Net into a searchable super Ans-Net with dataset-specific blocks and dataset adapters (DADs). During training, LSAMD searches for an optimal architecture path for each dataset. The base blocks most frequently selected across datasets are extracted as learngenes for initializing Des-Nets. Experiments on multiple datasets show that LSAMD achieves performance comparable to pretrain-finetune methods while significantly reducing storage and training costs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_08209
institution	arXiv
publishDate	2026
record_format	arxiv
title	Learngene Search Across Multiple Datasets for Building Variable-Sized Models
topic	Machine Learning
url	https://arxiv.org/abs/2605.08209
