Saved in:
| Main Authors: | Mokrý, Ondřej, Rajmic, Pavel |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.04433 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Regularized autoregressive modeling and its application to audio signal reconstruction
by: Mokrý, Ondřej, et al.
Published: (2024)
by: Mokrý, Ondřej, et al.
Published: (2024)
Multiple Hankel matrix rank minimization for audio inpainting
by: Záviška, Pavel, et al.
Published: (2023)
by: Záviška, Pavel, et al.
Published: (2023)
Janssen 2.0: Audio Inpainting in the Time-frequency Domain
by: Mokrý, Ondřej, et al.
Published: (2024)
by: Mokrý, Ondřej, et al.
Published: (2024)
Audio Inpainting in Time-Frequency Domain with Phase-Aware Prior
by: Balušík, Peter, et al.
Published: (2026)
by: Balušík, Peter, et al.
Published: (2026)
A MATLAB toolbox for Computation of Speech Transmission Index (STI)
by: Rajmic, Pavel, et al.
Published: (2025)
by: Rajmic, Pavel, et al.
Published: (2025)
Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
by: Peladeau, Côme, et al.
Published: (2023)
by: Peladeau, Côme, et al.
Published: (2023)
Scaling up masked audio encoder learning for general audio classification
by: Dinkel, Heinrich, et al.
Published: (2024)
by: Dinkel, Heinrich, et al.
Published: (2024)
ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
by: Liu, Yunyi, et al.
Published: (2024)
by: Liu, Yunyi, et al.
Published: (2024)
Self-supervised learning method using multiple sampling strategies for general-purpose audio representation
by: Kuroyanagi, Ibuki, et al.
Published: (2025)
by: Kuroyanagi, Ibuki, et al.
Published: (2025)
Towards audio language modeling -- an overview
by: Wu, Haibin, et al.
Published: (2024)
by: Wu, Haibin, et al.
Published: (2024)
Why some audio signal short-time Fourier transform coefficients have nonuniform phase distributions
by: Voran, Stephen D.
Published: (2024)
by: Voran, Stephen D.
Published: (2024)
Compositional nonlinear audio signal processing with Volterra series
by: Araujo-Simon, Jake
Published: (2023)
by: Araujo-Simon, Jake
Published: (2023)
Are audio DeepFake detection models polyglots?
by: Marek, Bartłomiej, et al.
Published: (2024)
by: Marek, Bartłomiej, et al.
Published: (2024)
MBCodec:Thorough disentangle for high-fidelity audio compression
by: Zhang, Ruonan, et al.
Published: (2025)
by: Zhang, Ruonan, et al.
Published: (2025)
Real-time implementation of vibrato transfer as an audio effect
by: Hyrkas, Jeremy
Published: (2025)
by: Hyrkas, Jeremy
Published: (2025)
EDTC: enhance depth of text comprehension in automated audio captioning
by: Tan, Liwen, et al.
Published: (2024)
by: Tan, Liwen, et al.
Published: (2024)
Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding
by: Xi, Yu, et al.
Published: (2025)
by: Xi, Yu, et al.
Published: (2025)
Speaker anonymization using neural audio codec language models
by: Panariello, Michele, et al.
Published: (2023)
by: Panariello, Michele, et al.
Published: (2023)
FxSearcher: gradient-free text-driven audio transformation
by: Ki, Hojoon, et al.
Published: (2025)
by: Ki, Hojoon, et al.
Published: (2025)
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)
by: An, Keyu, et al.
Published: (2024)
MusicGen-Stem: Multi-stem music generation and edition through autoregressive modeling
by: Rouard, Simon, et al.
Published: (2025)
by: Rouard, Simon, et al.
Published: (2025)
AxLSTMs: learning self-supervised audio representations with xLSTMs
by: Yadav, Sarthak, et al.
Published: (2024)
by: Yadav, Sarthak, et al.
Published: (2024)
STASE: A spatialized text-to-audio synthesis engine for music generation
by: Chi, Tutti, et al.
Published: (2025)
by: Chi, Tutti, et al.
Published: (2025)
Deep learning based spatial aliasing reduction in beamforming for audio capture
by: Guzik, Mateusz, et al.
Published: (2025)
by: Guzik, Mateusz, et al.
Published: (2025)
Enhancement by postfiltering for speech and audio coding in ad-hoc sensor networks
by: Das, Sneha, et al.
Published: (2020)
by: Das, Sneha, et al.
Published: (2020)
Human-CLAP: Human-perception-based contrastive language-audio pretraining
by: Takano, Taisei, et al.
Published: (2025)
by: Takano, Taisei, et al.
Published: (2025)
DashengTokenizer: One layer is enough for unified audio understanding and generation
by: Dinkel, Heinrich, et al.
Published: (2026)
by: Dinkel, Heinrich, et al.
Published: (2026)
A robust audio deepfake detection system via multi-view feature
by: Yang, Yujie, et al.
Published: (2024)
by: Yang, Yujie, et al.
Published: (2024)
Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline
by: Riley, Xavier, et al.
Published: (2024)
by: Riley, Xavier, et al.
Published: (2024)
Exploring trends in audio mixes and masters: Insights from a dataset analysis
by: Mourgela, Angeliki, et al.
Published: (2024)
by: Mourgela, Angeliki, et al.
Published: (2024)
RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio
by: Kanamori, Yusuke, et al.
Published: (2025)
by: Kanamori, Yusuke, et al.
Published: (2025)
Modeling strategies for speech enhancement in the latent space of a neural audio codec
by: Kammoun, Sofiene, et al.
Published: (2025)
by: Kammoun, Sofiene, et al.
Published: (2025)
ACAVCaps: Enabling large-scale training for fine-grained and diverse audio understanding
by: Niu, Yadong, et al.
Published: (2026)
by: Niu, Yadong, et al.
Published: (2026)
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
by: Siuzdak, Hubert
Published: (2023)
by: Siuzdak, Hubert
Published: (2023)
A tunable binaural audio telepresence system capable of balancing immersive and enhanced modes
by: Hsu, Yicheng, et al.
Published: (2024)
by: Hsu, Yicheng, et al.
Published: (2024)
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models
by: Wu, Haibin, et al.
Published: (2024)
by: Wu, Haibin, et al.
Published: (2024)
Efficient learning-based sound propagation for virtual and real-world audio processing applications
by: Ratnarajah, Anton Jeran
Published: (2024)
by: Ratnarajah, Anton Jeran
Published: (2024)
ParaCLAP -- Towards a general language-audio model for computational paralinguistic tasks
by: Jing, Xin, et al.
Published: (2024)
by: Jing, Xin, et al.
Published: (2024)
WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
by: Yuksel, Goksenin, et al.
Published: (2025)
by: Yuksel, Goksenin, et al.
Published: (2025)
audio2chart: End to End Audio Transcription into playable Guitar Hero charts
by: Tripodi, Riccardo
Published: (2025)
by: Tripodi, Riccardo
Published: (2025)
Similar Items
-
Regularized autoregressive modeling and its application to audio signal reconstruction
by: Mokrý, Ondřej, et al.
Published: (2024) -
Multiple Hankel matrix rank minimization for audio inpainting
by: Záviška, Pavel, et al.
Published: (2023) -
Janssen 2.0: Audio Inpainting in the Time-frequency Domain
by: Mokrý, Ondřej, et al.
Published: (2024) -
Audio Inpainting in Time-Frequency Domain with Phase-Aware Prior
by: Balušík, Peter, et al.
Published: (2026) -
A MATLAB toolbox for Computation of Speech Transmission Index (STI)
by: Rajmic, Pavel, et al.
Published: (2025)