Bibliographic Details
Main Authors: Rafkin, Emma, DeGenaro, Dan, Yang, Xiulin
Format: Preprint
Published: 2026
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2601.07038
author Rafkin, Emma
DeGenaro, Dan
Yang, Xiulin
contents The development of resource-constrained approaches to automatic speech recognition (ASR) is of great interest because such approaches apply broadly to the many low-resource languages for which usable data is scant. Existing approaches to many low-resource natural language processing tasks leverage additional data from higher-resource languages that are closely related to the target low-resource language. One increasingly popular approach uses task arithmetic to combine models trained on different tasks into a model for a task with little to no training data. In this paper, we treat training on a particular language as a task, and we generate task vectors by fine-tuning variants of the Whisper ASR system. For pairs of high- and low-resource languages, we merge task vectors via a linear combination that is optimized for the downstream word error rate on the low-resource target language's validation set. Across the 23 low-resource target languages on which we evaluate this technique, we find consistent word error rate improvements of up to 10% compared to a baseline without our approach.
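The merging step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the form of the combination (base + target vector + lam * support vector), the grid search over a single coefficient `lam`, the toy `Params` type standing in for Whisper weights, and the hypothetical `val_wer` callback that would score a merged model on the target language's validation set.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, base: Params) -> Params:
    """Task vector: fine-tuned weights minus the pretrained (base) weights."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def merge(base: Params, tau_target: Params, tau_support: Params, lam: float) -> Params:
    """Assumed combination: base + tau_target + lam * tau_support."""
    return {
        k: [b + t + lam * s for b, t, s in zip(base[k], tau_target[k], tau_support[k])]
        for k in base
    }


def pick_lambda(
    base: Params,
    tau_target: Params,
    tau_support: Params,
    val_wer: Callable[[Params], float],  # hypothetical: validation WER of a merged model
    grid: Sequence[float],
) -> Tuple[float, float]:
    """Grid-search lam and keep the value with the lowest validation WER."""
    best_lam = min(grid, key=lambda lam: val_wer(merge(base, tau_target, tau_support, lam)))
    return best_lam, val_wer(merge(base, tau_target, tau_support, best_lam))
```

In practice `val_wer` would load the merged weights into the ASR model, transcribe the validation set, and compute word error rate; the grid could be replaced by any one-dimensional optimizer.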
format Preprint
id arxiv_https___arxiv_org_abs_2601_07038
institution arXiv
publishDate 2026
record_format arxiv
title Task Arithmetic with Support Languages for Low-Resource ASR
topic Computation and Language
url https://arxiv.org/abs/2601.07038