| Main Authors: | Shen, Rubing, Ren, Yanzhen, Sun, Zongkun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.04575 |
Similar Items
Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)
A Universal Harmonic Discriminator for High-quality GAN-based Vocoder
by: Xu, Nan, et al.
Published: (2025)
QHARMA-GAN: Quasi-Harmonic Neural Vocoder based on Autoregressive Moving Average Model
by: Chen, Shaowen, et al.
Published: (2025)
VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
by: Cao, Yubing, et al.
Published: (2024)
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
by: Shibuya, Takashi, et al.
Published: (2023)
WaveNeXt 2: ConvNeXt-Based Fast Neural Vocoders With Residual Denoising and Sub-Modeling for GAN and Diffusion Models
by: Zhou, Wangzixi, et al.
Published: (2026)
Flow2GAN: Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-step High-Fidelity Audio Generation
by: Yao, Zengwei, et al.
Published: (2025)
Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN
by: Zhang, Shiqi, et al.
Published: (2024)
Neural Vocoders as Speech Enhancers
by: Li, Andong, et al.
Published: (2025)
A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)
GAN-Based Multi-Microphone Spatial Target Speaker Extraction
by: Shetu, Shrishti Saha, et al.
Published: (2025)
DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration
by: Serbest, Sanberk, et al.
Published: (2025)
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
by: Cho, Hyunjae, et al.
Published: (2024)
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
by: Nguyen, Tan Dat, et al.
Published: (2024)
Leveraging Discriminative Latent Representations for Conditioning GAN-Based Speech Enhancement
by: Shetu, Shrishti Saha, et al.
Published: (2025)
MaskCycleGAN-based Whisper to Normal Speech Conversion
by: Gupta, K. Rohith, et al.
Published: (2024)
Semantic Proximity Alignment: Towards Human Perception-consistent Audio Tagging by Aligning with Label Text Description
by: Liu, Wuyang, et al.
Published: (2023)
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
by: Baoueb, Teysir, et al.
Published: (2024)
TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection
by: Ma, Chengyuan, et al.
Published: (2026)
A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
by: Du, Hui-Peng, et al.
Published: (2024)
DTT-BSR: GAN-based DTTNet with RoPE Transformer Enhancement for Music Source Restoration
by: Tan, Shihong, et al.
Published: (2026)
Neurodyne: Neural Pitch Manipulation with Representation Learning and Cycle-Consistency GAN
by: Gu, Yicheng, et al.
Published: (2025)
AudioGAN: A Compact and Efficient Framework for Real-Time High-Fidelity Text-to-Audio Generation
by: Chung, HaeChun
Published: (2025)
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)
MusicHiFi: Fast High-Fidelity Stereo Vocoding
by: Zhu, Ge, et al.
Published: (2024)
Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction
by: Valin, Jean-Marc, et al.
Published: (2024)
Factorized RVQ-GAN For Disentangled Speech Tokenization
by: Khurana, Sameer, et al.
Published: (2025)
Disentanglement in a GAN for Unconditional Speech Synthesis
by: Baas, Matthew, et al.
Published: (2023)
Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
by: Shetu, Shrishti Saha, et al.
Published: (2024)
Towards Out-of-Distribution Detection in Vocoder Recognition via Latent Feature Reconstruction
by: Du, Renmingyue, et al.
Published: (2024)
Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder
by: Shi, Renzheng, et al.
Published: (2024)
CMGAN: Conformer-based Metric GAN for Speech Enhancement
by: Cao, Ruizhe, et al.
Published: (2022)
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
by: Du, Chenpeng, et al.
Published: (2023)
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)
Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments
by: Yoneyama, Reo, et al.
Published: (2025)
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
by: Lv, Yuanjun, et al.
Published: (2024)
Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)
EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
by: Zhang, Tong, et al.
Published: (2025)