:: Library Catalog

Bibliographic Details
Main Authors:	Shi, Renzheng; Bär, Andreas; Sach, Marvin; Tirry, Wouter; Fingscheidt, Tim
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2408.11842

Similar Items

ICASSP 2026 URGENT Speech Enhancement Challenge
by: Li, Chenda, et al.
Published: (2026)

Interspeech 2025 URGENT Speech Enhancement Challenge
by: Saijo, Kohei, et al.
Published: (2025)

DisContSE: Single-Step Diffusion Speech Enhancement Based on Joint Discrete and Continuous Embeddings
by: Fu, Yihui, et al.
Published: (2026)

P.808 Multilingual Speech Enhancement Testing: Approach and Results of URGENT 2025 Challenge
by: Sach, Marvin, et al.
Published: (2025)

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
by: Zhang, Wangyou, et al.
Published: (2024)

Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise Suppression
by: Seidel, Ernst, et al.
Published: (2024)

Less is More: Data Curation Matters in Scaling Speech Enhancement
by: Li, Chenda, et al.
Published: (2025)

Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)

A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)

URGENT-PK: Perceptually-Aligned Ranking Model Designed for Speech Enhancement Competition
by: Wang, Jiahe, et al.
Published: (2025)

Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech Generation from SSL features
by: Ohnaka, Hien, et al.
Published: (2026)

Neural Vocoders as Speech Enhancers
by: Li, Andong, et al.
Published: (2025)

Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
by: Yoneyama, Reo, et al.
Published: (2025)

SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations
by: Yang, Xiaoyu, et al.
Published: (2025)

Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)

Neural Kalman Filters for Acoustic Echo Cancellation
by: Seidel, Ernst, et al.
Published: (2025)

Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
by: Agrawal, Prabhav, et al.
Published: (2024)

SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR
by: Shankar, Natarajan Balaji, et al.
Published: (2024)

How Far Do SSL Speech Models Listen for Tone? Temporal Focus of Tone Representation under Low-resource Transfer
by: Kim, Minu, et al.
Published: (2025)

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS
by: Hsu, Po-chun, et al.
Published: (2023)

k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning
by: Yang, Yifan, et al.
Published: (2024)

Towards Out-of-Distribution Detection in Vocoder Recognition via Latent Feature Reconstruction
by: Du, Renmingyue, et al.
Published: (2024)

A Universal Harmonic Discriminator for High-quality GAN-based Vocoder
by: Xu, Nan, et al.
Published: (2025)

DiffVQE: Hybrid Diffusion Voice Quality Enhancement Under Acoustic Echo and Noise
by: Girao, Haljan Lugo, et al.
Published: (2026)

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)

Speech World Model: Causal State-Action Planning with Explicit Reasoning for Speech
by: Zhou, Xuanru, et al.
Published: (2025)

Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement
by: Guimarães, Heitor R., et al.
Published: (2025)

Fusion of Modulation Spectrogram and SSL with Multi-head Attention for Fake Speech Detection
by: N, Rishith Sadashiv T, et al.
Published: (2025)

Ultra-Low Latency Speech Enhancement - A Comprehensive Study
by: Wu, Haibin, et al.
Published: (2024)

Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)

FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
by: Shen, Rubing, et al.
Published: (2024)

LL-SDR: Low-Latency Speech enhancement through Discrete Representations
by: Li, Jingyi, et al.
Published: (2026)

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)

DENSE: Dynamic Embedding Causal Target Speech Extraction
by: Wang, Yiwen, et al.
Published: (2024)

UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models
by: Fan, Ruchao, et al.
Published: (2024)

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
by: Du, Chenpeng, et al.
Published: (2023)

VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
by: Cao, Yubing, et al.
Published: (2024)

Causal Spatio-Temporal Sound Field Reconstruction
by: Sundström, David, et al.
Published: (2026)

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features
by: Tsunoo, Emiru, et al.
Published: (2024)