| Main Authors: | Lin, Yi-Cheng, Chou, Huang-Cheng, Wei, Tzu-Chieh, Chen, Kuan-Yu, Lee, Hung-yi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.13989 |
Similar Items
Gender Bias in Instruction-Guided Speech Synthesis Models
by: Kuan, Chun-Yi, et al.
Published: (2025)
Speech-IFEval: Evaluating Instruction-Following and Quantifying Catastrophic Forgetting in Speech-Aware Language Models
by: Lu, Ke-Han, et al.
Published: (2025)
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples
by: Kuan, Chun-Yi, et al.
Published: (2025)
Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI
by: Lin, Yi-Cheng, et al.
Published: (2026)
Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach
by: Lin, Yi-Cheng, et al.
Published: (2025)
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
by: Kuan, Chun-Yi, et al.
Published: (2024)
Emo-bias: A Large Scale Evaluation of Social Bias on Speech Emotion Recognition
by: Lin, Yi-Cheng, et al.
Published: (2024)
Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection
by: Lin, Hsi-Che, et al.
Published: (2024)
CO-VADA: A Confidence-Oriented Voice Augmentation Debiasing Approach for Fair Speech Emotion Recognition
by: Tsai, Yun-Shao, et al.
Published: (2025)
EMO-Debias: Benchmarking Gender Debiasing Techniques in Multi-Label Speech Emotion Recognition
by: Lin, Yi-Cheng, et al.
Published: (2025)
Towards General-Purpose Text-Instruction-Guided Voice Conversion
by: Kuan, Chun-Yi, et al.
Published: (2023)
Meta-PerSER: Few-Shot Listener Personalized Speech Emotion Recognition via Meta-learning
by: Shen, Liang-Yeh, et al.
Published: (2025)
Identifying Speaker Information in Feed-Forward Layers of Self-Supervised Speech Transformers
by: Lin, Tzu-Quan, et al.
Published: (2025)
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling
by: Tseng, Liang-Hsuan, et al.
Published: (2025)
Multi-Distillation from Speech and Music Representation Models
by: Wei, Jui-Chiang, et al.
Published: (2025)
The False Resonance: A Critical Examination of Emotion Embedding Similarity for Speech Generation Evaluation
by: Tsai, Yun-Shao, et al.
Published: (2026)
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models
by: Lin, Yi-Cheng, et al.
Published: (2024)
Stimulus Modality Matters: Impact of Perceptual Evaluations from Different Modalities on Speech Emotion Recognition System Performance
by: Chou, Huang-Cheng, et al.
Published: (2024)
SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning
by: Huang, Chien-yu, et al.
Published: (2024)
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models
by: Lin, Yi-Cheng, et al.
Published: (2024)
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
by: Yang, Chih-Kai, et al.
Published: (2024)
How Contrastive Decoding Enhances Large Audio Language Models?
by: Lin, Tzu-Quan, et al.
Published: (2026)
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
by: Lin, Tzu-Quan, et al.
Published: (2024)
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
by: Huang, Kuan-Po, et al.
Published: (2023)
On the social bias of speech self-supervised models
by: Lin, Yi-Cheng, et al.
Published: (2024)
Reducing Object Hallucination in Large Audio-Language Models via Audio-Aware Decoding
by: Hsu, Tzu-wen, et al.
Published: (2025)
VIBE: Voice-Induced open-ended Bias Evaluation for Large Audio-Language Models via Real-World Speech
by: Lin, Yi-Cheng, et al.
Published: (2026)
Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model
by: Fu, Yu-Kuan, et al.
Published: (2024)
Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition
by: Lin, Yi-Cheng, et al.
Published: (2025)
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
by: Kuan, Chun-Yi, et al.
Published: (2024)
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
by: Huang, Chien-yu, et al.
Published: (2023)
EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations
by: Ren, Wenze, et al.
Published: (2024)
SMILE: Speech Meta In-Context Learning for Low-Resource Language Automatic Speech Recognition
by: Hsu, Ming-Hao, et al.
Published: (2024)
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)
AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data
by: Kuan, Chun-Yi, et al.
Published: (2025)
Property Neurons in Self-Supervised Speech Transformers
by: Lin, Tzu-Quan, et al.
Published: (2024)
AQAScore: Evaluating Semantic Alignment in Text-to-Audio Generation via Audio Question Answering
by: Kuan, Chun-Yi, et al.
Published: (2026)
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
by: Yang, Chih-Kai, et al.
Published: (2023)