Bibliographic Details
Main Authors: Xu, Jing; Wu, Minglin; Wu, Xixin; Meng, Helen
Format: Preprint
Published: 2024
Subjects: Computation and Language; Audio and Speech Processing
Online Access: https://arxiv.org/abs/2406.14092
_version_ 1866914000117170176
author Xu, Jing
Wu, Minglin
Wu, Xixin
Meng, Helen
contents Self-supervised learning (SSL) models have shown strong performance on various downstream tasks. However, they are typically developed for a limited set of languages and may encounter new languages in the real world. Developing an SSL model for each new language is costly. Thus, it is vital to figure out how to efficiently adapt existing SSL models to a new language without impairing their original abilities. We propose adaptation methods that integrate LoRA into existing SSL models to extend them to a new language. We also develop preservation strategies, which include data combination and re-clustering, to retain abilities on existing languages. Applied to mHuBERT, we investigate their effectiveness on the speech re-synthesis task. Experiments show that our adaptation methods enable mHuBERT to be applied to a new language (Mandarin), with the MOS increased by about 1.6 and the WER reduced by up to 61.72% in relative terms. Also, our preservation strategies ensure that the performance on both existing and new languages remains intact.
format Preprint
id arxiv_https___arxiv_org_abs_2406_14092
institution arXiv
publishDate 2024
record_format arxiv
title Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models
topic Computation and Language
Audio and Speech Processing
url https://arxiv.org/abs/2406.14092