

Bibliographic Details
Main Authors:	Nguyen, The-Hai; Huu-Tien, Dang; Suzuki, Takeshi; Nguyen, Le-Minh
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2508.03121

_version_	1866918466930343936
author	Nguyen, The-Hai; Huu-Tien, Dang; Suzuki, Takeshi; Nguyen, Le-Minh
author_facet	Nguyen, The-Hai; Huu-Tien, Dang; Suzuki, Takeshi; Nguyen, Le-Minh
contents	Regression Mean (RegMean), an approach that formulates model merging as a linear regression problem, aims to find the optimal weights for each linear layer in the merged model by minimizing the discrepancy in predictions between the merged and candidate models. RegMean provides a precise closed-form solution for the merging problem; therefore, it offers explainability and computational efficiency. However, RegMean merges each linear layer independently, overlooking how the features and information in earlier layers propagate through deeper layers and influence the final predictions of the merged model. Here, we introduce RegMean++, a simple yet effective alternative to RegMean, that explicitly incorporates both intra-layer and cross-layer dependencies between merged models' layers into RegMean's objective. By accounting for these dependencies, RegMean++ better captures the behaviors of the merged model. Extensive experiments demonstrate that RegMean++ consistently outperforms RegMean across diverse settings, including in-domain (ID) and out-of-domain (OOD) generalization, sequential merging, large-scale tasks, and robustness under several types of distribution shifts. Furthermore, RegMean++ achieves competitive performance across diverse settings compared to various advanced model merging methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_03121
institution	arXiv
publishDate	2025
record_format	arxiv
title	RegMean++: Enhancing Effectiveness and Generalization of Regression Mean for Model Merging
topic	Machine Learning
url	https://arxiv.org/abs/2508.03121
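
The contents field above describes RegMean as a per-layer linear regression with a closed-form solution. As a rough illustration only, the sketch below merges a single linear layer by solving the least-squares problem min_W sum_i ||X_i W - X_i W_i||^2, whose solution is W = (sum_i X_i^T X_i)^(-1) sum_i X_i^T X_i W_i. The function name, argument layout, and the `alpha` off-diagonal scaling knob are assumptions made for this example and are not taken from the record; the cross-layer handling that RegMean++ adds is not shown.

```python
import numpy as np

def regmean_merge_layer(weights, features, alpha=1.0):
    """Merge one linear layer (y = x @ W) from several candidate models.

    weights:  list of (d_in, d_out) arrays, one per candidate model
    features: list of (n_i, d_in) arrays, the inputs each model's layer saw
    alpha:    factor applied to off-diagonal Gram entries (1.0 = unchanged);
              a hypothetical stabilizing knob for this sketch
    """
    d_in = weights[0].shape[0]
    gram_sum = np.zeros((d_in, d_in))
    weighted_sum = np.zeros(weights[0].shape, dtype=float)
    for W, X in zip(weights, features):
        G = X.T @ X                                   # Gram matrix of layer inputs
        G = alpha * G + (1.0 - alpha) * np.diag(np.diag(G))  # keep diagonal, scale the rest
        gram_sum += G
        weighted_sum += G @ W
    # Solve (sum_i G_i) W = sum_i G_i W_i instead of forming an explicit inverse.
    return np.linalg.solve(gram_sum, weighted_sum)

# Usage with random stand-in data (two candidate models, one layer each).
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(64, 4)), rng.normal(size=(64, 4))
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
W_merged = regmean_merge_layer([W1, W2], [X1, X2])
```

When every candidate sees the same input features, this reduces to plain weight averaging; the Gram matrices matter only when the candidates' input distributions differ.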
