| Main Authors: | Jiang, Xiao-Hang, Ai, Yang, Du, Hui-Peng, Ling, Zhen-Hua, Wu, Ji |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.26812 |
Similar Items
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
by: Jiang, Xiao-Hang, et al.
Published: (2024)
Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction
by: Du, Hui-Peng, et al.
Published: (2026)
CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction
by: Du, Hui-Peng, et al.
Published: (2026)
A High-Quality and Low-Complexity Streamable Neural Speech Codec with Knowledge Distillation
by: Zhang, En-Wei, et al.
Published: (2025)
BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec
by: Xin, Detai, et al.
Published: (2024)
VoCodec: An Efficient Lightweight Low-Bitrate Speech Codec
by: Yang, Leyan, et al.
Published: (2026)
SPG-Codec: Exploring the Role and Boundaries of Semantic Priors in Ultra-Low-Bitrate Neural Speech Coding
by: Zhao, Mingyu, et al.
Published: (2026)
ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
by: Zheng, Rui-Chen, et al.
Published: (2024)
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)
APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
by: Ai, Yang, et al.
Published: (2024)
MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement
by: Li, Jingyu, et al.
Published: (2025)
Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec
by: Ren, Yanzhou, et al.
Published: (2026)
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec
by: Guo, Yiwei, et al.
Published: (2024)
Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation
by: Zheng, Rui-Chen, et al.
Published: (2025)
A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction
by: Du, Hui-Peng, et al.
Published: (2025)
Optimizing Neural Speech Codec for Low-Bitrate Compression via Multi-Scale Encoding
by: Yang, Peiji, et al.
Published: (2024)
MuCodec: Ultra Low-Bitrate Music Codec
by: Xu, Yaoxun, et al.
Published: (2024)
Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
by: Du, Hui-Peng, et al.
Published: (2024)
Vision-Integrated High-Quality Neural Speech Coding
by: Guo, Yao, et al.
Published: (2025)
Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)
Stage-Wise and Prior-Aware Neural Speech Phase Prediction
by: Liu, Fei, et al.
Published: (2024)
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
by: Jiang, Xue, et al.
Published: (2025)
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
by: Lu, Ye-Xin, et al.
Published: (2023)
XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs
by: Gong, Yitian, et al.
Published: (2025)
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
by: Lu, Ye-Xin, et al.
Published: (2023)
FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
by: Jung, Chaeyoung, et al.
Published: (2024)
Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ
by: Chae, Yunkee, et al.
Published: (2025)
A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
by: Della Libera, Luca, et al.
Published: (2025)
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2024)
FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation
by: Della Libera, Luca, et al.
Published: (2025)
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation
by: Du, Hui-Peng, et al.
Published: (2024)
Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising
by: Lu, Ye-Xin, et al.
Published: (2025)
Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?
by: Du, Hui-Peng, et al.
Published: (2025)
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
by: Lu, Ye-Xin, et al.
Published: (2024)
Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs
by: Langman, Ryan, et al.
Published: (2024)
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound
by: Liu, Haohe, et al.
Published: (2024)
DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis
by: Lu, Ye-Xin, et al.
Published: (2025)
Personalized Neural Speech Codec
by: Jang, Inseon, et al.
Published: (2024)