:: Library Catalog

Bibliographic Details
Main Authors:	Sabra, Adam; Wronka, Cyprian; Mao, Michelle; Hijazi, Samer
Format:	Preprint
Published:	2024
Subjects:	Sound; Information Retrieval; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2402.12482

Similar Items

MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2024)

Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization
by: Altwlkany, Kemal, et al.
Published: (2024)

Analyzing and reducing the synthetic-to-real transfer gap in Music Information Retrieval: the task of automatic drum transcription
by: Zehren, Mickaël, et al.
Published: (2024)

Transfer Learning with Semi-Supervised Dataset Annotation for Birdcall Classification
by: Miyaguchi, Anthony, et al.
Published: (2023)

Music Foundation Model as Generic Booster for Music Downstream Tasks
by: Liao, WeiHsiang, et al.
Published: (2024)

Hybrid Losses for Hierarchical Embedding Learning
by: Tian, Haokun, et al.
Published: (2025)

Deconstructing Jazz Piano Style Using Machine Learning
by: Cheston, Huw, et al.
Published: (2025)

A Novel Audio Representation for Music Genre Identification in MIR
by: Kamuni, Navin, et al.
Published: (2024)

EAViT: External Attention Vision Transformer for Audio Classification
by: Iqbal, Aquib, et al.
Published: (2024)

Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
by: Yoo, HaeJun, et al.
Published: (2024)

Multi-label Cross-lingual automatic music genre classification from lyrics with Sentence BERT
by: Tavares, Tiago Fernandes, et al.
Published: (2025)

Improving Musical Instrument Classification with Advanced Machine Learning Techniques
by: Chulev, Joanikij
Published: (2024)

Dissecting Temporal Understanding in Text-to-Audio Retrieval
by: Oncescu, Andreea-Maria, et al.
Published: (2024)

Emergent musical properties of a transformer under contrastive self-supervised learning
by: Kong, Yuexuan, et al.
Published: (2025)

Uncertainty Estimation in the Real World: A Study on Music Emotion Recognition
by: Watcharasupat, Karn N., et al.
Published: (2025)

Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries
by: Watcharasupat, Karn N., et al.
Published: (2025)

Universal Music Representations? Evaluating Foundation Models on World Music Corpora
by: Papaioannou, Charilaos, et al.
Published: (2025)

From Real to Cloned Singer Identification
by: Desblancs, Dorian, et al.
Published: (2024)

Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
by: Loweimi, Erfan, et al.
Published: (2025)

Adaptive Slimming for Scalable and Efficient Speech Enhancement
by: Miccini, Riccardo, et al.
Published: (2025)

Scalable Speech Enhancement with Dynamic Channel Pruning
by: Miccini, Riccardo, et al.
Published: (2024)

wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech
by: Le-Duc, Khai, et al.
Published: (2024)

Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)

Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
by: de Oliveira, Danilo, et al.
Published: (2024)

MERGE -- A Bimodal Audio-Lyrics Dataset for Static Music Emotion Recognition
by: Louro, Pedro Lima, et al.
Published: (2024)

On the Effect of Data-Augmentation on Local Embedding Properties in the Contrastive Learning of Music Audio Representations
by: McCallum, Matthew C., et al.
Published: (2024)

A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability
by: Tseng, Li-Yang, et al.
Published: (2024)

ASK: Adaptive Self-improving Knowledge Framework for Audio Text Retrieval
by: Fu, Siyuan, et al.
Published: (2025)

Music Genre Classification: Ensemble Learning with Subcomponents-level Attention
by: Liu, Yichen, et al.
Published: (2024)

Learning Normal Patterns in Musical Loops
by: Dadman, Shayan, et al.
Published: (2025)

Equivariance-based self-supervised learning for audio signal recovery from clipped measurements
by: Sechaud, Victor, et al.
Published: (2024)

CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
by: Abootorabi, Mohammad Mahdi, et al.
Published: (2024)

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering
by: Lin, Chyi-Jiunn, et al.
Published: (2024)

EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
by: Richter, Julius, et al.
Published: (2024)

Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech
by: Reszka, Joanna, et al.
Published: (2024)

Posterior Transition Modeling for Unsupervised Diffusion-Based Speech Enhancement
by: Sadeghi, Mostafa, et al.
Published: (2025)

Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)

Segment Length Matters: A Study of Segment Lengths on Audio Fingerprinting Performance
by: Gong, Ziling, et al.
Published: (2026)

WikiMuTe: A web-sourced dataset of semantic descriptions for music audio
by: Weck, Benno, et al.
Published: (2023)

Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search
by: McCallum, Matthew C., et al.
Published: (2024)