| Main Authors: | Liem, Cynthia C. S., Taşcılar, Doğa, Demetriou, Andrew M. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.03676 |
Similar Items
Music Enhancement with Deep Filters: A Technical Report for The ICASSP 2024 Cadenza Challenge
by: Shao, Keren, et al.
Published: (2024)
ICASSP 2026 URGENT Speech Enhancement Challenge
by: Li, Chenda, et al.
Published: (2026)
The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge
by: Ma, Guobin, et al.
Published: (2026)
The ICASSP 2024 Audio Deep Packet Loss Concealment Challenge
by: Diener, Lorenz, et al.
Published: (2024)
ClaritySpeech: Dementia Obfuscation in Speech
by: Woszczyk, Dominika, et al.
Published: (2025)
Prosody-Driven Privacy-Preserving Dementia Detection
by: Woszczyk, Dominika, et al.
Published: (2024)
Unsupervised outlier detection to improve bird audio dataset labels
by: Collins, Bruce
Published: (2025)
SCORE-SET: A dataset of GuitarPro files for Music Phrase Generation and Sequence Learning
by: Begari, Vishakh
Published: (2025)
Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics
by: Ghani, Burooj, et al.
Published: (2024)
SLEEPING-DISCO 9M: A large-scale pre-training dataset for generative music modeling
by: Ahmed, Tawsif, et al.
Published: (2025)
Throat and acoustic paired speech dataset for deep learning-based speech enhancement
by: Kim, Yunsik, et al.
Published: (2025)
KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge
by: Yu, Guochen, et al.
Published: (2024)
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
by: Chang, Sungkyun, et al.
Published: (2024)
ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis
by: Ni-Hahn, Stephen, et al.
Published: (2025)
MidiCaps: A large-scale MIDI dataset with text captions
by: Melechovsky, Jan, et al.
Published: (2024)
Detecting gamma-band responses to the speech envelope for the ICASSP 2024 Auditory EEG Decoding Signal Processing Grand Challenge
by: Thornton, Mike, et al.
Published: (2024)
U-Mamba-Net: A highly efficient Mamba-based U-net style network for noisy and reverberant speech separation
by: Dang, Shaoxiang, et al.
Published: (2024)
Learning Disentangled Audio Representations through Controlled Synthesis
by: Brima, Yusuf, et al.
Published: (2024)
Identification of Cognitive Decline from Spoken Language through Feature Selection and the Bag of Acoustic Words Model
by: Niemelä, Marko, et al.
Published: (2024)
Mismatch-Robust Underwater Acoustic Localization Using A Differentiable Modular Forward Model
by: Kari, Dariush, et al.
Published: (2025)
Boosting keyword spotting through on-device learnable user speech characteristics
by: Cioflan, Cristian, et al.
Published: (2024)
Joint Source-Environment Adaptation for Deep Learning-Based Underwater Acoustic Source Ranging
by: Kari, Dariush, et al.
Published: (2025)
A Data-Centric Framework for Machine Listening Projects: Addressing Large-Scale Data Acquisition and Labeling through Active Learning
by: Naranjo-Alcazar, Javier, et al.
Published: (2024)
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
by: Siuzdak, Hubert
Published: (2023)
Enhancing Audio-Language Models through Self-Supervised Post-Training with Text-Audio Pairs
by: Sinha, Anshuman, et al.
Published: (2024)
SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization
by: Tang, Beilong, et al.
Published: (2025)
Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds
by: Chang, Andrew, et al.
Published: (2025)
Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts
by: Xie, Yuan, et al.
Published: (2024)
RiTTA: Modeling Event Relations in Text-to-Audio Generation
by: He, Yuhang, et al.
Published: (2024)
Joint Source-Environment Adaptation of Data-Driven Underwater Acoustic Source Ranging Based on Model Uncertainty
by: Kari, Dariush, et al.
Published: (2025)
On the Condition Monitoring of Bolted Joints through Acoustic Emission and Deep Transfer Learning: Generalization, Ordinal Loss and Super-Convergence
by: Ramasso, Emmanuel, et al.
Published: (2024)
ASTRA: Aligning Speech and Text Representations for Asr without Sampling
by: Gaur, Neeraj, et al.
Published: (2024)
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Wang, He, et al.
Published: (2024)
LLark: A Multimodal Instruction-Following Language Model for Music
by: Gardner, Josh, et al.
Published: (2023)
A Feature Engineering Approach for Literary and Colloquial Tamil Speech Classification using 1D-CNN
by: Nanmalar, M., et al.
Published: (2024)
FlowDec: A flow-based full-band general audio codec with high perceptual quality
by: Welker, Simon, et al.
Published: (2025)
Multimodal Lyrics-Rhythm Matching
by: Liao, Callie C., et al.
Published: (2023)
Reconstruction of Sound Field through Diffusion Models
by: Miotello, Federico, et al.
Published: (2023)
Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model
by: Park, Hyun Jin, et al.
Published: (2024)
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting
by: Park, Hyun Jin, et al.
Published: (2024)