:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Nan; Huang, Zhaolong; Zeng, Xiao
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2512.03486

Similar Items

MDDM: A Multi-view Discriminative Enhanced Diffusion-based Model for Speech Enhancement
by: Xu, Nan, et al.
Published: (2025)

FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
by: Shen, Rubing, et al.
Published: (2024)

VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
by: Cao, Yubing, et al.
Published: (2024)

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)

QHARMA-GAN: Quasi-Harmonic Neural Vocoder based on Autoregressive Moving Average Model
by: Chen, Shaowen, et al.
Published: (2025)

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)

Vocoder-Projected Feature Discriminator
by: Kaneko, Takuhiro, et al.
Published: (2025)

BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
by: Shibuya, Takashi, et al.
Published: (2023)

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)

WaveNeXt 2: ConvNeXt-Based Fast Neural Vocoders With Residual Denoising and Sub-Modeling for GAN and Diffusion Models
by: Zhou, Wangzixi, et al.
Published: (2026)

Neural Vocoders as Speech Enhancers
by: Li, Andong, et al.
Published: (2025)

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)

Leveraging Discriminative Latent Representations for Conditioning GAN-Based Speech Enhancement
by: Shetu, Shrishti Saha, et al.
Published: (2025)

MusicHiFi: Fast High-Fidelity Stereo Vocoding
by: Zhu, Ge, et al.
Published: (2024)

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)

Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
by: Kaneko, Takuhiro, et al.
Published: (2024)

A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)

Towards Out-of-Distribution Detection in Vocoder Recognition via Latent Feature Reconstruction
by: Du, Renmingyue, et al.
Published: (2024)

Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems
by: Liu, Jeongmin, et al.
Published: (2024)

Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder
by: Shi, Renzheng, et al.
Published: (2024)

Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement
by: Guimarães, Heitor R., et al.
Published: (2025)

A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
by: Du, Hui-Peng, et al.
Published: (2024)

Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
by: Yoneyama, Reo, et al.
Published: (2025)

WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
by: Luo, Tianze, et al.
Published: (2025)

FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
by: Lv, Yuanjun, et al.
Published: (2024)

Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)

Flow2GAN: Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-step High-Fidelity Audio Generation
by: Yao, Zengwei, et al.
Published: (2025)

Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
by: Agrawal, Prabhav, et al.
Published: (2024)

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
by: Du, Chenpeng, et al.
Published: (2023)

Enhancing Spectrogram Realism in Singing Voice Synthesis via Explicit Bandwidth Extension Prior to Vocoder
by: Yang, Runxuan, et al.
Published: (2025)

FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
by: Nguyen, Tan Dat, et al.
Published: (2024)

Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech Generation from SSL features
by: Ohnaka, Hien, et al.
Published: (2026)

A Hybrid Discriminative and Generative System for Universal Speech Enhancement
by: Liu, Yinghao, et al.
Published: (2026)

Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)

Pseudo-Cepstrum: Pitch Modification for Mel-Based Neural Vocoders
by: Ellinas, Nikolaos, et al.
Published: (2025)

Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach
by: Abdullah, Abdulhady Abas, et al.
Published: (2024)

RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer
by: Hong, Seongho, et al.
Published: (2025)

Real-Time Streaming Mel Vocoding with Generative Flow Matching
by: Welker, Simon, et al.
Published: (2025)

LDCodec: A high quality neural audio codec with low-complexity decoder
by: Jiang, Jiawei, et al.
Published: (2025)

ArrayDPS-Refine: Generative Refinement of Discriminative Multi-Channel Speech Enhancement
by: Xu, Zhongweiyang, et al.
Published: (2026)