Staff View: Targeted Multilingual Adaptation for Low-resource Language Families :: Library Catalog

Bibliographic Details
Main Authors:	Downey, C. M.; Blevins, Terra; Serai, Dhwani; Parikh, Dwija; Steinert-Threlkeld, Shane
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2405.12413

_version_	1866913357782581248
author	Downey, C. M.; Blevins, Terra; Serai, Dhwani; Parikh, Dwija; Steinert-Threlkeld, Shane
author_facet	Downey, C. M.; Blevins, Terra; Serai, Dhwani; Parikh, Dwija; Steinert-Threlkeld, Shane
contents	The "massively-multilingual" training of multilingual models is known to limit their utility in any one language, and they perform particularly poorly on low-resource languages. However, there is evidence that low-resource languages can benefit from targeted multilinguality, where the model is trained on closely related languages. To test this approach more rigorously, we systematically study best practices for adapting a pre-trained model to a language family. Focusing on the Uralic family as a test case, we adapt XLM-R under various configurations to model 15 languages; we then evaluate the performance of each experimental setting on two downstream tasks and 11 evaluation languages. Our adapted models significantly outperform mono- and multilingual baselines. Furthermore, a regression analysis of hyperparameter effects reveals that adapted vocabulary size is relatively unimportant for low-resource languages, and that low-resource languages can be aggressively up-sampled during training at little detriment to performance in high-resource languages. These results introduce new best practices for performing language adaptation in a targeted setting.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_12413
institution	arXiv
publishDate	2024
record_format	arxiv
title	Targeted Multilingual Adaptation for Low-resource Language Families
topic	Computation and Language
url	https://arxiv.org/abs/2405.12413

Similar Items