| Main Authors: | Xu, Nan; Huang, Zhaolong; Zeng, Xiao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.03486 |
Similar Items
MDDM: A Multi-view Discriminative Enhanced Diffusion-based Model for Speech Enhancement
by: Xu, Nan, et al.
Published: (2025)
FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
by: Shen, Rubing, et al.
Published: (2024)
VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
by: Cao, Yubing, et al.
Published: (2024)
Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)
QHARMA-GAN: Quasi-Harmonic Neural Vocoder based on Autoregressive Moving Average Model
by: Chen, Shaowen, et al.
Published: (2025)
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)
Vocoder-Projected Feature Discriminator
by: Kaneko, Takuhiro, et al.
Published: (2025)
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
by: Shibuya, Takashi, et al.
Published: (2023)
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)
WaveNeXt 2: ConvNeXt-Based Fast Neural Vocoders With Residual Denoising and Sub-Modeling for GAN and Diffusion Models
by: Zhou, Wangzixi, et al.
Published: (2026)
Neural Vocoders as Speech Enhancers
by: Li, Andong, et al.
Published: (2025)
Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)
Leveraging Discriminative Latent Representations for Conditioning GAN-Based Speech Enhancement
by: Shetu, Shrishti Saha, et al.
Published: (2025)
MusicHiFi: Fast High-Fidelity Stereo Vocoding
by: Zhu, Ge, et al.
Published: (2024)
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
by: Kaneko, Takuhiro, et al.
Published: (2024)
A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)
Towards Out-of-Distribution Detection in Vocoder Recognition via Latent Feature Reconstruction
by: Du, Renmingyue, et al.
Published: (2024)
Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems
by: Liu, Jeongmin, et al.
Published: (2024)
Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder
by: Shi, Renzheng, et al.
Published: (2024)
Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement
by: Guimarães, Heitor R., et al.
Published: (2025)
A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
by: Du, Hui-Peng, et al.
Published: (2024)
Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
by: Yoneyama, Reo, et al.
Published: (2025)
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
by: Luo, Tianze, et al.
Published: (2025)
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
by: Lv, Yuanjun, et al.
Published: (2024)
Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)
Flow2GAN: Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-step High-Fidelity Audio Generation
by: Yao, Zengwei, et al.
Published: (2025)
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
by: Agrawal, Prabhav, et al.
Published: (2024)
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
by: Du, Chenpeng, et al.
Published: (2023)
Enhancing Spectrogram Realism in Singing Voice Synthesis via Explicit Bandwidth Extension Prior to Vocoder
by: Yang, Runxuan, et al.
Published: (2025)
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
by: Nguyen, Tan Dat, et al.
Published: (2024)
Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech Generation from SSL features
by: Ohnaka, Hien, et al.
Published: (2026)
A Hybrid Discriminative and Generative System for Universal Speech Enhancement
by: Liu, Yinghao, et al.
Published: (2026)
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)
Pseudo-Cepstrum: Pitch Modification for Mel-Based Neural Vocoders
by: Ellinas, Nikolaos, et al.
Published: (2025)
Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach
by: Abdullah, Abdulhady Abas, et al.
Published: (2024)
RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer
by: Hong, Seongho, et al.
Published: (2025)
Real-Time Streaming Mel Vocoding with Generative Flow Matching
by: Welker, Simon, et al.
Published: (2025)
LDCodec: A high quality neural audio codec with low-complexity decoder
by: Jiang, Jiawei, et al.
Published: (2025)
ArrayDPS-Refine: Generative Refinement of Discriminative Multi-Channel Speech Enhancement
by: Xu, Zhongweiyang, et al.
Published: (2026)