Staff View: Accent Conversion with Articulatory Representations :: Library Catalog

Bibliographic Details
Main Authors:	Siriwardena, Yashish M.; Swedlow, Nathan; Howard, Audrey; Gitterman, Evan; Darcy, Dan; Espy-Wilson, Carol; Fanelli, Andrea
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.05947

_version_	1866929380210507776
author	Siriwardena, Yashish M.; Swedlow, Nathan; Howard, Audrey; Gitterman, Evan; Darcy, Dan; Espy-Wilson, Carol; Fanelli, Andrea
author_facet	Siriwardena, Yashish M.; Swedlow, Nathan; Howard, Audrey; Gitterman, Evan; Darcy, Dan; Espy-Wilson, Carol; Fanelli, Andrea
contents	Conversion of non-native accented speech to native (American) English has a wide range of applications, such as improving the intelligibility of non-native speech. Previous work in this domain has used phonetic posteriorgrams as the target speech representation to train an acoustic model, which is then used to extract a compact representation of input speech for accent conversion. In this work, we introduce the idea of using an effective articulatory speech representation, extracted from an acoustic-to-articulatory speech inversion system, to improve the acoustic model used in accent conversion. The idea of incorporating articulatory representations originates from their ability to characterize accents in speech well. To combine articulatory representations with conventional phonetic posteriorgrams, a multi-task learning based acoustic model is proposed. Objective and subjective evaluations show that the use of articulatory representations can improve the effectiveness of accent conversion.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_05947
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Accent Conversion with Articulatory Representations Siriwardena, Yashish M. Swedlow, Nathan Howard, Audrey Gitterman, Evan Darcy, Dan Espy-Wilson, Carol Fanelli, Andrea Audio and Speech Processing Conversion of non-native accented speech to native (American) English has a wide range of applications, such as improving the intelligibility of non-native speech. Previous work in this domain has used phonetic posteriorgrams as the target speech representation to train an acoustic model, which is then used to extract a compact representation of input speech for accent conversion. In this work, we introduce the idea of using an effective articulatory speech representation, extracted from an acoustic-to-articulatory speech inversion system, to improve the acoustic model used in accent conversion. The idea of incorporating articulatory representations originates from their ability to characterize accents in speech well. To combine articulatory representations with conventional phonetic posteriorgrams, a multi-task learning based acoustic model is proposed. Objective and subjective evaluations show that the use of articulatory representations can improve the effectiveness of accent conversion.
title	Accent Conversion with Articulatory Representations
topic	Audio and Speech Processing
url	https://arxiv.org/abs/2406.05947

Similar Items