Bibliographic Details
Main Authors: Kuan, Chun-Yi; Lee, Hung-yi
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2502.05649
_version_ 1866913684583874560
author Kuan, Chun-Yi
Lee, Hung-yi
contents Recent advancements in controllable expressive speech synthesis, especially in text-to-speech (TTS) models, have allowed for the generation of speech with specific styles guided by textual descriptions, known as style prompts. While this development enhances the flexibility and naturalness of synthesized speech, there remains a significant gap in understanding how these models handle vague or abstract style prompts. This study investigates the potential gender bias in how models interpret occupation-related prompts, specifically examining their responses to instructions like "Act like a nurse". We explore whether these models exhibit tendencies to amplify gender stereotypes when interpreting such prompts. Our experimental results reveal the model's tendency to exhibit gender bias for certain occupations. Moreover, models of different sizes show varying degrees of this bias across these occupations.
format Preprint
id arxiv_https___arxiv_org_abs_2502_05649
institution arXiv
publishDate 2025
record_format arxiv
title Gender Bias in Instruction-Guided Speech Synthesis Models
topic Computation and Language
Machine Learning
Audio and Speech Processing
url https://arxiv.org/abs/2502.05649