Bibliographic Details
Main Authors: Shi, Boyu; Zhou, Junbo; Liu, Chang; Yang, Xu; Wang, Qiufeng; Geng, Xin
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2605.08209
author Shi, Boyu
Zhou, Junbo
Liu, Chang
Yang, Xu
Wang, Qiufeng
Geng, Xin
contents Deep learning methods are widely used under diverse resource constraints, resulting in models of varying sizes, such as the Vision Transformer (ViT) series. Deploying these models typically requires costly pretraining and finetuning. The Learngene paradigm addresses this issue by extracting transferable components, called learngenes, from a pretrained ancestry model (Ans-Net) to initialize variable-sized descendant models (Des-Nets). Existing learngene extraction methods rely on a single dataset, limiting downstream performance. To address this limitation, we propose Learngene Search Across Multiple Datasets for Building Variable-Sized Models (LSAMD). LSAMD expands the Ans-Net into a searchable super Ans-Net with dataset-specific blocks and dataset adapters (DADs). During training, LSAMD searches for an optimal architecture path for each dataset. The base blocks most frequently selected across datasets are extracted as learngenes for initializing Des-Nets. Experiments on multiple datasets show that LSAMD achieves performance comparable to pretrain-finetune methods while significantly reducing storage and training costs.
format Preprint
id arxiv_https___arxiv_org_abs_2605_08209
institution arXiv
publishDate 2026
record_format arxiv
title Learngene Search Across Multiple Datasets for Building Variable-Sized Models
topic Machine Learning
url https://arxiv.org/abs/2605.08209