Staff View: Task Arithmetic with Support Languages for Low-Resource ASR :: Library Catalog

Bibliographic Details
Main Authors:	Rafkin, Emma; DeGenaro, Dan; Yang, Xiulin
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2601.07038

_version_	1866912966848282624
author	Rafkin, Emma; DeGenaro, Dan; Yang, Xiulin
author_facet	Rafkin, Emma; DeGenaro, Dan; Yang, Xiulin
contents	The development of resource-constrained approaches to automatic speech recognition (ASR) is of great interest due to its broad applicability to many low-resource languages for which there is scant usable data. Existing approaches to many low-resource natural language processing tasks leverage additional data from higher-resource languages that are closely related to a target low-resource language. One increasingly popular approach uses task arithmetic to combine models trained on different tasks to create a model for a task where there is little to no training data. In this paper, we consider training on a particular language to be a task, and we generate task vectors by fine-tuning variants of the Whisper ASR system. For pairs of high- and low-resource languages, we merge task vectors via a linear combination which is optimized on the downstream word error rate on the low-resource target language's validation set. Across 23 low-resource target languages for which we evaluate this technique, we find consistent word error rate improvements of up to 10% compared to a baseline without our approach.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_07038
institution	arXiv
publishDate	2026
record_format	arxiv
title	Task Arithmetic with Support Languages for Low-Resource ASR
topic	Computation and Language
url	https://arxiv.org/abs/2601.07038
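
Illustrative note (not part of the catalogued record): the contents field describes building task vectors from fine-tuned models and merging a high-resource and a low-resource task vector by a linear combination tuned on validation word error rate. A minimal sketch of that idea follows. It makes several assumptions that the record does not confirm: parameters are plain Python lists rather than Whisper weights, the coefficients are chosen by a simple grid search, and the names `task_vector`, `merge`, `word_error_rate`, and `search_coefficients` are hypothetical helpers, not the authors' code.

```python
from itertools import product


def task_vector(pretrained, finetuned):
    """Per-parameter difference: fine-tuned weights minus pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def merge(pretrained, tau_target, tau_support, lam_target, lam_support):
    """Add a linear combination of two task vectors to the pretrained weights."""
    return {
        name: [
            p + lam_target * t + lam_support * s
            for p, t, s in zip(pretrained[name], tau_target[name], tau_support[name])
        ]
        for name in pretrained
    }


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / max(len(ref), 1)


def search_coefficients(pretrained, tau_target, tau_support, validation_score, grid):
    """Grid search (an assumption) for the coefficient pair with the lowest score.

    `validation_score` is a caller-supplied function that maps merged
    parameters to a validation-set error such as WER.
    """
    best = None
    for lam_target, lam_support in product(grid, repeat=2):
        merged = merge(pretrained, tau_target, tau_support, lam_target, lam_support)
        score = validation_score(merged)
        if best is None or score < best[0]:
            best = (score, lam_target, lam_support)
    return best
```

In practice `validation_score` would decode the low-resource validation set with the merged model and average `word_error_rate` over its utterances; the grid search stands in for whatever optimizer the paper actually uses.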