| Main Authors: | Sun, Anchen, Londono, Juan J, Elbaum, Batya, Estrada, Luis, Lazo, Roberto Jose, Vitale, Laura, Villasanti, Hugo Gonzalez, Fusaroli, Riccardo, Perry, Lynn K, Messinger, Daniel S |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.07342 |
Similar Items
Who Said What WSW 2.0? Enhanced Automated Analysis of Preschool Classroom Speech
by: Sun, Anchen, et al.
Published: (2025)
From Who Said What to Who They Are: Modular Training-free Identity-Aware LLM Refinement of Speaker Diarization
by: Chen, Yu-Wen, et al.
Published: (2025)
VoxSafeBench: Not Just What Is Said, but Who, How, and Where
by: Wang, Yuxiang, et al.
Published: (2026)
Who is Speaking or Who is Depressed? A Controlled Study of Speaker Leakage in Speech-Based Depression Detection
by: Yeh, Hsiang-Chen, et al.
Published: (2026)
Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation
by: Wang, Pu, et al.
Published: (2024)
Multitask Learning with Capsule Networks for Speech-to-Intent Applications
by: Poncelet, Jakob, et al.
Published: (2020)
Unsupervised Online Continual Learning for Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2024)
Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives
by: Jacobs, Christiaan, et al.
Published: (2025)
SNIPER Training: Single-Shot Sparse Training for Text-to-Speech
by: Lam, Perry, et al.
Published: (2022)
Using Adapters to Overcome Catastrophic Forgetting in End-to-End Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2022)
Analyzing and Improving Speaker Similarity Assessment for Speech Synthesis
by: Carbonneau, Marc-André, et al.
Published: (2025)
Analyzing the Impact of Accent on English Speech: Acoustic and Articulatory Perspectives
by: Premananth, Gowtham, et al.
Published: (2025)
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
by: Poncelet, Jakob, et al.
Published: (2021)
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
by: Poncelet, Jakob, et al.
Published: (2025)
Rehearsal-Free Online Continual Learning for Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2023)
Beyond Manual Transcripts: The Potential of Automated Speech Recognition Errors in Improving Alzheimer's Disease Detection
by: Liu, Yin-Long, et al.
Published: (2025)
Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units
by: Poncelet, Jakob, et al.
Published: (2023)
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
by: Duret, Jarod, et al.
Published: (2024)
Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models
by: Poncelet, Jakob, et al.
Published: (2024)
What Do Speech Foundation Models Not Learn About Speech?
by: Waheed, Abdul, et al.
Published: (2024)
Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora
by: Suda, Hitoshi, et al.
Published: (2025)
TellWhisper: Tell Whisper Who Speaks When
by: Hu, Yifan, et al.
Published: (2026)
EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition
by: Thimonier, Hugo, et al.
Published: (2025)
RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines
by: Attia, Ahmed Adel, et al.
Published: (2025)
Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens
by: Zhao, Jinzheng, et al.
Published: (2024)
Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems
by: Lin, Yi-Cheng, et al.
Published: (2025)
AS-Speech: Adaptive Style For Speech Synthesis
by: Li, Zhipeng, et al.
Published: (2024)
What do Speech Foundation Models Learn? Analysis and Applications
by: Pasad, Ankita
Published: (2025)
A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition
by: de Groot, Dimme, et al.
Published: (2026)
Scalable Speech Enhancement with Dynamic Channel Pruning
by: Miccini, Riccardo, et al.
Published: (2024)
Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)
Psychophysiology-aided Perceptually Fluent Speech Analysis of Children Who Stutter
by: Xiao, Yi, et al.
Published: (2022)
Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics
by: Tawara, Naohiro, et al.
Published: (2026)
Influence of Clean Speech Characteristics on Speech Enhancement Performance
by: Hou, Mingchi, et al.
Published: (2025)
What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
by: Ho, Kuan-Hsun, et al.
Published: (2024)
Analyzing the Impact of Splicing Artifacts in Partially Fake Speech Signals
by: Negroni, Viola, et al.
Published: (2024)
Expanding and Analyzing ODAQ -- the Open Dataset of Audio Quality
by: Dick, Sascha, et al.
Published: (2025)
Adaptive Slimming for Scalable and Efficient Speech Enhancement
by: Miccini, Riccardo, et al.
Published: (2025)
Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech
by: Bhattacharjee, Susmita, et al.
Published: (2025)
Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)