Bibliographic Details
Main Authors: Wang, Shenran; Yang, Changbing; Parkhill, Mike; Quinn, Chad; Hammerly, Christopher; Zhu, Jian
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2502.02703
Abstract:
  • We present lightweight flow-matching multilingual text-to-speech (TTS) systems for Ojibwe, Mi'kmaq, and Maliseet, three Indigenous languages of North America. Our results show that training a multilingual TTS model on three typologically similar languages can improve performance over monolingual models, especially when data are scarce. Attention-free architectures are highly competitive with self-attention architectures while being more memory-efficient. Our research not only advances technical development for the revitalization of low-resource languages but also highlights the cultural gap in human evaluation protocols, calling for a more community-centered approach to human evaluation.