
Bibliographic Details
Main Authors:	Jiang, Xiao-Hang; Ai, Yang; Du, Hui-Peng; Ling, Zhen-Hua; Wu, Ji
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2605.26812

Similar Items

MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
by: Jiang, Xiao-Hang, et al.
Published: (2024)

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)

CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction
by: Du, Hui-Peng, et al.
Published: (2026)

A High-Quality and Low-Complexity Streamable Neural Speech Codec with Knowledge Distillation
by: Zhang, En-Wei, et al.
Published: (2025)

BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec
by: Xin, Detai, et al.
Published: (2024)

VoCodec: An Efficient Lightweight Low-Bitrate Speech Codec
by: Yang, Leyan, et al.
Published: (2026)

SPG-Codec: Exploring the Role and Boundaries of Semantic Priors in Ultra-Low-Bitrate Neural Speech Coding
by: Zhao, Mingyu, et al.
Published: (2026)

ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
by: Zheng, Rui-Chen, et al.
Published: (2024)

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
by: Ai, Yang, et al.
Published: (2024)

MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement
by: Li, Jingyu, et al.
Published: (2025)

Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec
by: Ren, Yanzhou, et al.
Published: (2026)

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec
by: Guo, Yiwei, et al.
Published: (2024)

Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation
by: Zheng, Rui-Chen, et al.
Published: (2025)

A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)

Optimizing Neural Speech Codec for Low-Bitrate Compression via Multi-Scale Encoding
by: Yang, Peiji, et al.
Published: (2024)

MuCodec: Ultra Low-Bitrate Music Codec
by: Xu, Yaoxun, et al.
Published: (2024)

Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
by: Du, Hui-Peng, et al.
Published: (2024)

Vision-Integrated High-Quality Neural Speech Coding
by: Guo, Yao, et al.
Published: (2025)

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)

Stage-Wise and Prior-Aware Neural Speech Phase Prediction
by: Liu, Fei, et al.
Published: (2024)

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
by: Jiang, Xue, et al.
Published: (2025)

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
by: Lu, Ye-Xin, et al.
Published: (2023)

XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs
by: Gong, Yitian, et al.
Published: (2025)

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
by: Lu, Ye-Xin, et al.
Published: (2023)

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
by: Jung, Chaeyoung, et al.
Published: (2024)

Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ
by: Chae, Yunkee, et al.
Published: (2025)

A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
by: Della Libera, Luca, et al.
Published: (2025)

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2024)

FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation
by: Della Libera, Luca, et al.
Published: (2025)

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising
by: Lu, Ye-Xin, et al.
Published: (2025)

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
by: Lu, Ye-Xin, et al.
Published: (2024)

Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs
by: Langman, Ryan, et al.
Published: (2024)

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound
by: Liu, Haohe, et al.
Published: (2024)

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2025)

Personalized Neural Speech Codec
by: Jang, Inseon, et al.
Published: (2024)