| Main Authors: | Shi, Renzheng, Bär, Andreas, Sach, Marvin, Tirry, Wouter, Fingscheidt, Tim |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.11842 |
Similar Items
ICASSP 2026 URGENT Speech Enhancement Challenge
by: Li, Chenda, et al.
Published: (2026)
Interspeech 2025 URGENT Speech Enhancement Challenge
by: Saijo, Kohei, et al.
Published: (2025)
DisContSE: Single-Step Diffusion Speech Enhancement Based on Joint Discrete and Continuous Embeddings
by: Fu, Yihui, et al.
Published: (2026)
P.808 Multilingual Speech Enhancement Testing: Approach and Results of URGENT 2025 Challenge
by: Sach, Marvin, et al.
Published: (2025)
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
by: Zhang, Wangyou, et al.
Published: (2024)
Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise Suppression
by: Seidel, Ernst, et al.
Published: (2024)
Less is More: Data Curation Matters in Scaling Speech Enhancement
by: Li, Chenda, et al.
Published: (2025)
Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)
A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)
URGENT-PK: Perceptually-Aligned Ranking Model Designed for Speech Enhancement Competition
by: Wang, Jiahe, et al.
Published: (2025)
Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech Generation from SSL features
by: Ohnaka, Hien, et al.
Published: (2026)
Neural Vocoders as Speech Enhancers
by: Li, Andong, et al.
Published: (2025)
Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
by: Yoneyama, Reo, et al.
Published: (2025)
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations
by: Yang, Xiaoyu, et al.
Published: (2025)
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)
Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)
Neural Kalman Filters for Acoustic Echo Cancellation
by: Seidel, Ernst, et al.
Published: (2025)
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
by: Agrawal, Prabhav, et al.
Published: (2024)
SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR
by: Shankar, Natarajan Balaji, et al.
Published: (2024)
How Far Do SSL Speech Models Listen for Tone? Temporal Focus of Tone Representation under Low-resource Transfer
by: Kim, Minu, et al.
Published: (2025)
Low-Resource Self-Supervised Learning with SSL-Enhanced TTS
by: Hsu, Po-chun, et al.
Published: (2023)
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning
by: Yang, Yifan, et al.
Published: (2024)
Towards Out-of-Distribution Detection in Vocoder Recognition via Latent Feature Reconstruction
by: Du, Renmingyue, et al.
Published: (2024)
A Universal Harmonic Discriminator for High-quality GAN-based Vocoder
by: Xu, Nan, et al.
Published: (2025)
DiffVQE: Hybrid Diffusion Voice Quality Enhancement Under Acoustic Echo and Noise
by: Girao, Haljan Lugo, et al.
Published: (2026)
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)
Speech World Model: Causal State-Action Planning with Explicit Reasoning for Speech
by: Zhou, Xuanru, et al.
Published: (2025)
Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement
by: Guimarães, Heitor R., et al.
Published: (2025)
Fusion of Modulation Spectrogram and SSL with Multi-head Attention for Fake Speech Detection
by: N, Rishith Sadashiv T, et al.
Published: (2025)
Ultra-Low Latency Speech Enhancement - A Comprehensive Study
by: Wu, Haibin, et al.
Published: (2024)
Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)
FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
by: Shen, Rubing, et al.
Published: (2024)
LL-SDR: Low-Latency Speech enhancement through Discrete Representations
by: Li, Jingyi, et al.
Published: (2026)
Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)
DENSE: Dynamic Embedding Causal Target Speech Extraction
by: Wang, Yiwen, et al.
Published: (2024)
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models
by: Fan, Ruchao, et al.
Published: (2024)
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
by: Du, Chenpeng, et al.
Published: (2023)
VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
by: Cao, Yubing, et al.
Published: (2024)
Causal Spatio-Temporal Sound Field Reconstruction
by: Sundström, David, et al.
Published: (2026)
Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features
by: Tsunoo, Emiru, et al.
Published: (2024)