:: Library Catalog

Cover Image

Saved in:

Bibliographic Details
Main Authors:	Zheng, Rui-Chen, Du, Hui-Peng, Jiang, Xiao-Hang, Ai, Yang, Ling, Zhen-Hua
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2410.12359
Tags:	Add Tag No Tags, Be the first to tag this record!

Similar Items

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
by: Ai, Yang, et al.
Published: (2024)

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
by: Du, Hui-Peng, et al.
Published: (2024)

MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
by: Jiang, Xiao-Hang, et al.
Published: (2024)

CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction
by: Du, Hui-Peng, et al.
Published: (2026)

A High-Quality and Low-Complexity Streamable Neural Speech Codec with Knowledge Distillation
by: Zhang, En-Wei, et al.
Published: (2025)

CFMDCTCodec: A Low-Bitrate Neural Speech Codec with Noise-Prior-aware Conditional Flow Matching for MDCT-Spectral Enhancement
by: Jiang, Xiao-Hang, et al.
Published: (2026)

Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation
by: Zheng, Rui-Chen, et al.
Published: (2025)

Vision-Integrated High-Quality Neural Speech Coding
by: Guo, Yao, et al.
Published: (2025)

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)

A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)

NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization
by: Niu, Zhikang, et al.
Published: (2024)

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook
by: Jiang, Yidi, et al.
Published: (2025)

AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook
by: Chen, Yushen, et al.
Published: (2025)

An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec
by: Xu, Linping, et al.
Published: (2024)

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2025)

A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
by: Du, Hui-Peng, et al.
Published: (2024)

Stage-Wise and Prior-Aware Neural Speech Phase Prediction
by: Liu, Fei, et al.
Published: (2024)

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2024)

SwitchCodec: A High-Fidelity Nerual Audio Codec With Sparse Quantization
by: Wang, Jin, et al.
Published: (2025)

SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
by: Shi, Yu-Fei, et al.
Published: (2024)

Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy
by: Chen, Xuanjun, et al.
Published: (2025)

Bringing Interpretability to Neural Audio Codecs
by: Sadok, Samir, et al.
Published: (2025)

Variable Bitrate Residual Vector Quantization for Audio Coding
by: Chae, Yunkee, et al.
Published: (2024)

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation
by: Li, Hanzhao, et al.
Published: (2024)

Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)

DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec for Speech Generation
by: Li, Jiaqi, et al.
Published: (2025)

Enhancing Fully Formatted End-to-End Speech Recognition with Knowledge Distillation via Multi-Codebook Vector Quantization
by: You, Jian, et al.
Published: (2025)

CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)

MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement
by: Li, Jingyu, et al.
Published: (2025)

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising
by: Lu, Ye-Xin, et al.
Published: (2025)

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
by: Shi, Jiatong, et al.
Published: (2024)

Towards Neural Audio Codec Source Parsing
by: Phukan, Orchid Chetia, et al.
Published: (2025)

SuperCodec: A Neural Speech Codec with Selective Back-Projection Network
by: Zheng, Youqiang, et al.
Published: (2024)

Analysing the Language of Neural Audio Codecs
by: Park, Joonyong, et al.
Published: (2025)

Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
by: Li, Jiaqi, et al.
Published: (2024)

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
by: Lu, Ye-Xin, et al.
Published: (2024)

Acoustic Teleportation via Disentangled Neural Audio Codec Representations
by: Grundhuber, Philipp, et al.
Published: (2025)