| Main Authors: | de Oliveira, Danilo; Peer, Tal; Rochdi, Jonas; Gerkmann, Timo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.21317 |
Similar Items
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models
by: Richter, Julius, et al.
Published: (2025)
Investigating Training Objectives for Generative Speech Enhancement
by: Richter, Julius, et al.
Published: (2024)
Do We Need EMA for Diffusion-Based Speech Enhancement? Toward a Magnitude-Preserving Network Architecture
by: Richter, Julius, et al.
Published: (2025)
Real-Time Streaming Mel Vocoding with Generative Flow Matching
by: Welker, Simon, et al.
Published: (2025)
Too Good to Be True: A Study on Modern Automatic Speech Recognition for the Evaluation of Speech Enhancement
by: de Oliveira, Danilo, et al.
Published: (2026)
The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement
by: de Oliveira, Danilo, et al.
Published: (2024)
Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
by: de Oliveira, Danilo, et al.
Published: (2024)
An Analysis of the Variance of Diffusion-based Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2024)
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
by: Tesch, Kristina, et al.
Published: (2023)
Bone-conduction Guided Multimodal Speech Enhancement with Conditional Diffusion Models
by: Khanagha, Sina, et al.
Published: (2026)
Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
by: Makarov, Rostislav, et al.
Published: (2025)
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
by: Richter, Julius, et al.
Published: (2022)
Diffusion Buffer for Online Generative Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2025)
ReverbFX: A Dataset of Room Impulse Responses Derived from Reverb Effect Plugins for Singing Voice Dereverberation
by: Richter, Julius, et al.
Published: (2025)
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
by: Lemercier, Jean-Marie, et al.
Published: (2022)
Steering Deep Non-Linear Spatially Selective Filters for Weakly Guided Extraction of Moving Speakers in Dynamic Scenarios
by: Kienegger, Jakob, et al.
Published: (2025)
Autoregressive Guidance of Deep Spatially Selective Filters using Bayesian Tracking for Efficient Extraction of Moving Speakers
by: Kienegger, Jakob, et al.
Published: (2026)
Adaptive Rotary Steering with Joint Autoregression for Robust Extraction of Closely Moving Speakers in Dynamic Scenarios
by: Kienegger, Jakob, et al.
Published: (2026)
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
by: Prabhu, Navin Raj, et al.
Published: (2023)
Single and Few-step Diffusion for Generative Speech Enhancement
by: Lay, Bunlong, et al.
Published: (2023)
Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask Estimation
by: Kienegger, Jakob, et al.
Published: (2024)
Wind Noise Reduction with a Diffusion-based Stochastic Regeneration Model
by: Lemercier, Jean-Marie, et al.
Published: (2023)
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
by: Richter, Julius, et al.
Published: (2024)
Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance
by: Kienegger, Jakob, et al.
Published: (2025)
BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models
by: Moliner, Eloi, et al.
Published: (2024)
Unsupervised Blind Joint Dereverberation and Room Acoustics Estimation with Diffusion Models
by: Lemercier, Jean-Marie, et al.
Published: (2024)
The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2026)
Gibberish is All You Need for Membership Inference Detection in Contrastive Language-Audio Pretraining
by: Cheng, Ruoxi, et al.
Published: (2024)
Enhancing In-the-Wild Speech Emotion Conversion with Resynthesis-based Duration Modeling
by: Prabhu, Navin Raj, et al.
Published: (2025)
Diffusion Models for Audio Restoration
by: Lemercier, Jean-Marie, et al.
Published: (2024)
HRTF Estimation using a Score-based Prior
by: Thuillier, Etienne, et al.
Published: (2024)
Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models
by: Jing, Ruihao, et al.
Published: (2025)
Can LLMs Help Localize Fake Words in Partially Fake Speech?
by: Zhang, Lin, et al.
Published: (2026)
Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous Speech
by: Pu, Yu, et al.
Published: (2025)
A Fast Solver for Interpolating Stochastic Differential Equation Diffusion Models for Speech Restoration
by: Lay, Bunlong, et al.
Published: (2026)
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
by: Wang, Tianrui, et al.
Published: (2025)
Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning
by: Kashyap, Bipasha, et al.
Published: (2026)
Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration
by: Ku, Pin-Jui, et al.
Published: (2024)
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
by: Li, Jiaqi, et al.
Published: (2024)