:: Library Catalog

Bibliographic Details
Main Authors:	Sun, Anchen; Londono, Juan J; Elbaum, Batya; Estrada, Luis; Lazo, Roberto Jose; Vitale, Laura; Villasanti, Hugo Gonzalez; Fusaroli, Riccardo; Perry, Lynn K; Messinger, Daniel S
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Machine Learning
Online Access:	https://arxiv.org/abs/2401.07342

Similar Items

Who Said What (WSW 2.0)? Enhanced Automated Analysis of Preschool Classroom Speech
by: Sun, Anchen, et al.
Published: (2025)

From Who Said What to Who They Are: Modular Training-free Identity-Aware LLM Refinement of Speaker Diarization
by: Chen, Yu-Wen, et al.
Published: (2025)

VoxSafeBench: Not Just What Is Said, but Who, How, and Where
by: Wang, Yuxiang, et al.
Published: (2026)

Who is Speaking or Who is Depressed? A Controlled Study of Speaker Leakage in Speech-Based Depression Detection
by: Yeh, Hsiang-Chen, et al.
Published: (2026)

Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation
by: Wang, Pu, et al.
Published: (2024)

Multitask Learning with Capsule Networks for Speech-to-Intent Applications
by: Poncelet, Jakob, et al.
Published: (2020)

Unsupervised Online Continual Learning for Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2024)

Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives
by: Jacobs, Christiaan, et al.
Published: (2025)

SNIPER Training: Single-Shot Sparse Training for Text-to-Speech
by: Lam, Perry, et al.
Published: (2022)

Using Adapters to Overcome Catastrophic Forgetting in End-to-End Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2022)

Analyzing and Improving Speaker Similarity Assessment for Speech Synthesis
by: Carbonneau, Marc-André, et al.
Published: (2025)

Analyzing the Impact of Accent on English Speech: Acoustic and Articulatory Perspectives
by: Premananth, Gowtham, et al.
Published: (2025)

Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
by: Poncelet, Jakob, et al.
Published: (2021)

Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
by: Poncelet, Jakob, et al.
Published: (2025)

Rehearsal-Free Online Continual Learning for Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2023)

Beyond Manual Transcripts: The Potential of Automated Speech Recognition Errors in Improving Alzheimer's Disease Detection
by: Liu, Yin-Long, et al.
Published: (2025)

Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units
by: Poncelet, Jakob, et al.
Published: (2023)

Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
by: Duret, Jarod, et al.
Published: (2024)

Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models
by: Poncelet, Jakob, et al.
Published: (2024)

What Do Speech Foundation Models Not Learn About Speech?
by: Waheed, Abdul, et al.
Published: (2024)

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora
by: Suda, Hitoshi, et al.
Published: (2025)

TellWhisper: Tell Whisper Who Speaks When
by: Hu, Yifan, et al.
Published: (2026)

EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition
by: Thimonier, Hugo, et al.
Published: (2025)

RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines
by: Attia, Ahmed Adel, et al.
Published: (2025)

Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens
by: Zhao, Jinzheng, et al.
Published: (2024)

Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems
by: Lin, Yi-Cheng, et al.
Published: (2025)

AS-Speech: Adaptive Style For Speech Synthesis
by: Li, Zhipeng, et al.
Published: (2024)

What do Speech Foundation Models Learn? Analysis and Applications
by: Pasad, Ankita
Published: (2025)

A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition
by: de Groot, Dimme, et al.
Published: (2026)

Scalable Speech Enhancement with Dynamic Channel Pruning
by: Miccini, Riccardo, et al.
Published: (2024)

Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)

Psychophysiology-aided Perceptually Fluent Speech Analysis of Children Who Stutter
by: Xiao, Yi, et al.
Published: (2022)

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics
by: Tawara, Naohiro, et al.
Published: (2026)

Influence of Clean Speech Characteristics on Speech Enhancement Performance
by: Hou, Mingchi, et al.
Published: (2025)

What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
by: Ho, Kuan-Hsun, et al.
Published: (2024)

Analyzing the Impact of Splicing Artifacts in Partially Fake Speech Signals
by: Negroni, Viola, et al.
Published: (2024)

Expanding and Analyzing ODAQ -- the Open Dataset of Audio Quality
by: Dick, Sascha, et al.
Published: (2025)

Adaptive Slimming for Scalable and Efficient Speech Enhancement
by: Miccini, Riccardo, et al.
Published: (2025)

Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech
by: Bhattacharjee, Susmita, et al.
Published: (2025)

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)