| Main Authors: | de Oliveira, Danilo; Peer, Tal; Gerkmann, Timo |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.12107 |
Similar Items
Are These Even Words? Quantifying the Gibberishness of Generative Speech Models
by: de Oliveira, Danilo, et al.
Published: (2025)
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models
by: Richter, Julius, et al.
Published: (2025)
Investigating Training Objectives for Generative Speech Enhancement
by: Richter, Julius, et al.
Published: (2024)
Do We Need EMA for Diffusion-Based Speech Enhancement? Toward a Magnitude-Preserving Network Architecture
by: Richter, Julius, et al.
Published: (2025)
The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement
by: de Oliveira, Danilo, et al.
Published: (2024)
Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
by: Makarov, Rostislav, et al.
Published: (2025)
An Analysis of the Variance of Diffusion-based Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2024)
Enhancing In-the-Wild Speech Emotion Conversion with Resynthesis-based Duration Modeling
by: Prabhu, Navin Raj, et al.
Published: (2025)
Real-Time Streaming Mel Vocoding with Generative Flow Matching
by: Welker, Simon, et al.
Published: (2025)
A Fast Solver for Interpolating Stochastic Differential Equation Diffusion Models for Speech Restoration
by: Lay, Bunlong, et al.
Published: (2026)
Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
by: de Oliveira, Danilo, et al.
Published: (2024)
Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency
by: Lay, Bunlong, et al.
Published: (2025)
Bone-conduction Guided Multimodal Speech Enhancement with Conditional Diffusion Models
by: Khanagha, Sina, et al.
Published: (2026)
Diffusion Buffer for Online Generative Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2025)
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
by: Richter, Julius, et al.
Published: (2022)
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
by: Tesch, Kristina, et al.
Published: (2023)
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
by: Lemercier, Jean-Marie, et al.
Published: (2022)
Single and Few-step Diffusion for Generative Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2023)
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
by: Richter, Julius, et al.
Published: (2024)
Robustness of Speech Separation Models for Similar-pitch Speakers
by: Lay, Bunlong, et al.
Published: (2024)
A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition
by: de Groot, Dimme, et al.
Published: (2026)
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)
Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech
by: Bhattacharjee, Susmita, et al.
Published: (2025)
Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition
by: Yang, Da-Hee, et al.
Published: (2026)
Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS
by: Aronowitz, Hagai, et al.
Published: (2026)
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
by: Prabhu, Navin Raj, et al.
Published: (2023)
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
by: Chen, Peikun, et al.
Published: (2024)
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)
SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation
by: Du, Jiayu, et al.
Published: (2024)
The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Tian, Jingguang, et al.
Published: (2024)
An Analysis of Joint Nonlinear Spatial Filtering for Spatial Aliasing Reduction
by: Mannanova, Alina, et al.
Published: (2025)
Using Songs to Improve Kazakh Automatic Speech Recognition
by: Yeshpanov, Rustem
Published: (2026)
Unsupervised Online Continual Learning for Automatic Speech Recognition
by: Eeckt, Steven Vander, et al.
Published: (2024)
Non-Intrusive Automatic Speech Recognition Refinement: A Survey
by: Peyghan, Mohammad Reza, et al.
Published: (2025)
Objective and Subjective Evaluation of Diffusion-Based Speech Enhancement for Dysarthric Speech
by: de Groot, Dimme, et al.
Published: (2025)
Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora
by: Nespoli, Francesco, et al.
Published: (2024)
ReverbFX: A Dataset of Room Impulse Responses Derived from Reverb Effect Plugins for Singing Voice Dereverberation
by: Richter, Julius, et al.
Published: (2025)
A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation
by: Lima, Rodrigo, et al.
Published: (2024)
Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation
by: Wang, Pu, et al.
Published: (2024)
Automatic Speech Recognition for Hindi
by: Saha, Anish, et al.
Published: (2024)