| Main Authors: | Mack, Wolfgang, Mustafa, Ahmed, Łaganowski, Rafał, Hijazy, Samer |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.04770 |
Similar Items
Low-Resource Audio Codec (LRAC): 2025 Challenge Description
by: Wojcicki, Kamil, et al.
Published: (2025)
SNAC: Multi-Scale Neural Audio Codec
by: Siuzdak, Hubert, et al.
Published: (2024)
RepCodec: A Speech Representation Codec for Speech Tokenization
by: Huang, Zhichao, et al.
Published: (2023)
Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models
by: Nercessian, Shahan, et al.
Published: (2024)
A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation
by: Liu, Alexander H., et al.
Published: (2024)
Learning Source Disentanglement in Neural Audio Codec
by: Bie, Xiaoyu, et al.
Published: (2024)
Towards Audio Codec-based Speech Separation
by: Yip, Jia Qi, et al.
Published: (2024)
Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)
On the Relation Between Speech Quality and Quantized Latent Representations of Neural Codecs
by: Halimeh, Mhd Modar, et al.
Published: (2025)
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
by: Ji, Shengpeng, et al.
Published: (2024)
Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation
by: Zheng, Rui-Chen, et al.
Published: (2025)
EnCodecMAE: Leveraging neural codecs for universal audio representation learning
by: Pepino, Leonardo, et al.
Published: (2023)
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
by: Wang, Xiaofei, et al.
Published: (2023)
MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation
by: Song, Yakun, et al.
Published: (2025)
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
by: Wang, Yuancheng, et al.
Published: (2025)
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
by: Ji, Shengpeng, et al.
Published: (2023)
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization
by: Chen, Wenxi, et al.
Published: (2025)
PTQ4ADM: Post-Training Quantization for Efficient Text Conditional Audio Diffusion Models
by: Vora, Jayneel, et al.
Published: (2024)
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
by: Shi, Jiatong, et al.
Published: (2024)
ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs
by: Zheng, Rui-Chen, et al.
Published: (2024)
SwitchCodec: A High-Fidelity Nerual Audio Codec With Sparse Quantization
by: Wang, Jin, et al.
Published: (2025)
NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization
by: Niu, Zhikang, et al.
Published: (2024)
A Context-Based Numerical Format Prediction for a Text-To-Speech System
by: Darwesh, Yaser, et al.
Published: (2024)
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations
by: Dhawan, Kunal, et al.
Published: (2024)
VoCodec: An Efficient Lightweight Low-Bitrate Speech Codec
by: Yang, Leyan, et al.
Published: (2026)
Gull: A Generative Multifunctional Audio Codec
by: Luo, Yi, et al.
Published: (2024)
LiVeAction: a Lightweight, Versatile, and Asymmetric Neural Codec Design for Real-time Operation
by: Jacobellis, Dan, et al.
Published: (2026)
Towards Generalized Source Tracing for Codec-Based Deepfake Speech
by: Chen, Xuanjun, et al.
Published: (2025)
Distinctive Feature Codec: An Adaptive Efficient Speech Representation for Depression Detection
by: Zhang, Xiangyu, et al.
Published: (2025)
Variable Bitrate Residual Vector Quantization for Audio Coding
by: Chae, Yunkee, et al.
Published: (2024)
CoDiCodec: Unifying Continuous and Discrete Compressed Representations of Audio
by: Pasini, Marco, et al.
Published: (2025)
Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling
by: Karapiperis, Sotirios, et al.
Published: (2024)
Bringing Interpretability to Neural Audio Codecs
by: Sadok, Samir, et al.
Published: (2025)
Evaluation of Neural Surrogates for Physical Modelling Synthesis of Nonlinear Elastic Plates
by: Martin, Carlos De La Vega, et al.
Published: (2025)
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
by: Della Libera, Luca, et al.
Published: (2025)
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
by: Wang, Yuancheng, et al.
Published: (2024)
Towards Efficient and Real-Time Piano Transcription Using Neural Autoregressive Models
by: Kwon, Taegyun, et al.
Published: (2024)
SECP: A Speech Enhancement-Based Curation Pipeline For Scalable Acquisition Of Clean Speech
by: Sabra, Adam, et al.
Published: (2024)
SegINR: Segment-wise Implicit Neural Representation for Sequence Alignment in Neural Text-to-Speech
by: Kim, Minchan, et al.
Published: (2024)