| Main Authors: | Sabra, Adam, Wronka, Cyprian, Mao, Michelle, Hijazi, Samer |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.12482 |
Similar Items
MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2024)
Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization
by: Altwlkany, Kemal, et al.
Published: (2024)
Analyzing and reducing the synthetic-to-real transfer gap in Music Information Retrieval: the task of automatic drum transcription
by: Zehren, Mickaël, et al.
Published: (2024)
Transfer Learning with Semi-Supervised Dataset Annotation for Birdcall Classification
by: Miyaguchi, Anthony, et al.
Published: (2023)
Music Foundation Model as Generic Booster for Music Downstream Tasks
by: Liao, WeiHsiang, et al.
Published: (2024)
Hybrid Losses for Hierarchical Embedding Learning
by: Tian, Haokun, et al.
Published: (2025)
Deconstructing Jazz Piano Style Using Machine Learning
by: Cheston, Huw, et al.
Published: (2025)
A Novel Audio Representation for Music Genre Identification in MIR
by: Kamuni, Navin, et al.
Published: (2024)
EAViT: External Attention Vision Transformer for Audio Classification
by: Iqbal, Aquib, et al.
Published: (2024)
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
by: Yoo, HaeJun, et al.
Published: (2024)
Multi-label Cross-lingual automatic music genre classification from lyrics with Sentence BERT
by: Tavares, Tiago Fernandes, et al.
Published: (2025)
Improving Musical Instrument Classification with Advanced Machine Learning Techniques
by: Chulev, Joanikij
Published: (2024)
Dissecting Temporal Understanding in Text-to-Audio Retrieval
by: Oncescu, Andreea-Maria, et al.
Published: (2024)
Emergent musical properties of a transformer under contrastive self-supervised learning
by: Kong, Yuexuan, et al.
Published: (2025)
Uncertainty Estimation in the Real World: A Study on Music Emotion Recognition
by: Watcharasupat, Karn N., et al.
Published: (2025)
Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries
by: Watcharasupat, Karn N., et al.
Published: (2025)
Universal Music Representations? Evaluating Foundation Models on World Music Corpora
by: Papaioannou, Charilaos, et al.
Published: (2025)
From Real to Cloned Singer Identification
by: Desblancs, Dorian, et al.
Published: (2024)
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
by: Loweimi, Erfan, et al.
Published: (2025)
Adaptive Slimming for Scalable and Efficient Speech Enhancement
by: Miccini, Riccardo, et al.
Published: (2025)
Scalable Speech Enhancement with Dynamic Channel Pruning
by: Miccini, Riccardo, et al.
Published: (2024)
wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech
by: Le-Duc, Khai, et al.
Published: (2024)
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)
Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
by: de Oliveira, Danilo, et al.
Published: (2024)
MERGE -- A Bimodal Audio-Lyrics Dataset for Static Music Emotion Recognition
by: Louro, Pedro Lima, et al.
Published: (2024)
On the Effect of Data-Augmentation on Local Embedding Properties in the Contrastive Learning of Music Audio Representations
by: McCallum, Matthew C., et al.
Published: (2024)
A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability
by: Tseng, Li-Yang, et al.
Published: (2024)
ASK: Adaptive Self-improving Knowledge Framework for Audio Text Retrieval
by: Fu, Siyuan, et al.
Published: (2025)
Music Genre Classification: Ensemble Learning with Subcomponents-level Attention
by: Liu, Yichen, et al.
Published: (2024)
Learning Normal Patterns in Musical Loops
by: Dadman, Shayan, et al.
Published: (2025)
Equivariance-based self-supervised learning for audio signal recovery from clipped measurements
by: Sechaud, Victor, et al.
Published: (2024)
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
by: Abootorabi, Mohammad Mahdi, et al.
Published: (2024)
SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering
by: Lin, Chyi-Jiunn, et al.
Published: (2024)
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
by: Richter, Julius, et al.
Published: (2024)
Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech
by: Reszka, Joanna, et al.
Published: (2024)
Posterior Transition Modeling for Unsupervised Diffusion-Based Speech Enhancement
by: Sadeghi, Mostafa, et al.
Published: (2025)
Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)
Segment Length Matters: A Study of Segment Lengths on Audio Fingerprinting Performance
by: Gong, Ziling, et al.
Published: (2026)
WikiMuTe: A web-sourced dataset of semantic descriptions for music audio
by: Weck, Benno, et al.
Published: (2023)
Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search
by: McCallum, Matthew C., et al.
Published: (2024)