Bibliographic Details
Main Authors: Tabatabaee, Saba; Boyce, Suzanne; Oren, Liran; Tiede, Mark; Espy-Wilson, Carol
Format: Preprint
Published: 2025
Subjects: Audio and Speech Processing
Online Access: https://arxiv.org/abs/2509.09489
author Tabatabaee, Saba
Boyce, Suzanne
Oren, Liran
Tiede, Mark
Espy-Wilson, Carol
contents Traditional clinical approaches for assessing nasality, such as nasopharyngoscopy and nasometry, involve unpleasant experiences and are problematic for children. Speech Inversion (SI), a noninvasive technique, offers a promising alternative for estimating articulatory movement without the need for physical instrumentation. In this study, an SI system trained on nasalance data from healthy adults is augmented with source information from electroglottography and acoustically derived F0, periodic, and aperiodic energy estimates as proxies for glottal control. This model achieves a 16.92% relative improvement in Pearson Product-Moment Correlation (PPMC) compared to a previous SI system for nasalance estimation. To adapt the SI system for nasalance estimation in children with Velopharyngeal Insufficiency (VPI), the model initially trained on adult speech was fine-tuned using data from children with VPI, yielding a 7.90% relative improvement in PPMC compared to its performance before fine-tuning.
format Preprint
id arxiv_https___arxiv_org_abs_2509_09489
institution arXiv
publishDate 2025
record_format arxiv
title Acoustic to Articulatory Speech Inversion for Children with Velopharyngeal Insufficiency
topic Audio and Speech Processing
url https://arxiv.org/abs/2509.09489