| Main Authors: | Peeters, Geoffroy; Rafii, Zafar; Fuentes, Magdalena; Duan, Zhiyao; Benetos, Emmanouil; Nam, Juhan; Mitsufuji, Yuki |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.07205 |
Similar Items
30+ Years of Source Separation Research: Achievements and Future Challenges
by: Araki, Shoko, et al.
Published: (2025)
SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)
Universal Music Representations? Evaluating Foundation Models on World Music Corpora
by: Papaioannou, Charilaos, et al.
Published: (2025)
Can Large Language Models Predict Audio Effects Parameters from Natural Language?
by: Doh, Seungheon, et al.
Published: (2025)
LLM2Fx-Tools: Tool Calling For Music Post-Production
by: Doh, Seungheon, et al.
Published: (2025)
Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation
by: Bhattacharjee, Aditya, et al.
Published: (2025)
Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
by: Peladeau, Côme, et al.
Published: (2023)
LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024)
Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
Domain-Invariant Representation Learning of Bird Sounds
by: Moummad, Ilyass, et al.
Published: (2024)
FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation
by: Im, Jaekwon, et al.
Published: (2025)
DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech
by: Im, Jaekwon, et al.
Published: (2024)
Matchmaker: An Open-source Library for Real-time Piano Score Following and Systematic Evaluation
by: Park, Jiyun, et al.
Published: (2025)
Acoustic identification of individual animals with hierarchical contrastive learning
by: Nolasco, Ines, et al.
Published: (2024)
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
by: Chang, Sungkyun, et al.
Published: (2024)
LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
by: Singh, Shubhr, et al.
Published: (2025)
SCRAPL: Scattering Transform with Random Paths for Machine Learning
by: Mitcheltree, Christopher, et al.
Published: (2026)
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
by: Ma, Yinghao, et al.
Published: (2025)
A Contrastive Self-Supervised Learning scheme for beat tracking amenable to few-shot learning
by: Gagnere, Antonin, et al.
Published: (2024)
Controlling Contrastive Self-Supervised Learning with Knowledge-Driven Multiple Hypothesis: Application to Beat Tracking
by: Gagnere, Antonin, et al.
Published: (2025)
Episode-specific Fine-tuning for Metric-based Few-shot Learners with Optimization-based Training
by: Zhuang, Xuanyu, et al.
Published: (2025)
Embryonic Exposure to VPA Influences Chick Vocalisations: A Computational Study
by: Torrisi, Antonella M. C., et al.
Published: (2026)
GraFPrint: A GNN-Based Approach for Audio Identification
by: Bhattacharjee, Aditya, et al.
Published: (2024)
Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss
by: Huang, Jiawen, et al.
Published: (2025)
Motive-level Analysis of Form-functions Association in Korean Folk song
by: Han, Danbinaerin, et al.
Published: (2025)
MATPAC++: Enhanced Masked Latent Prediction for Self-Supervised Audio Representation Learning
by: Quelennec, Aurian, et al.
Published: (2025)
Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning
by: Quelennec, Aurian, et al.
Published: (2025)
QINCODEC: Neural Audio Compression with Implicit Neural Codebooks
by: Lahrichi, Zineb, et al.
Published: (2025)
Text2Score: Generating Sheet Music From Textual Prompts
by: Bhandari, Keshav, et al.
Published: (2026)
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)
Classification of Spontaneous and Scripted Speech for Multilingual Audio
by: Elisha, Shahar, et al.
Published: (2024)
A Data-Driven Analysis of Robust Automatic Piano Transcription
by: Edwards, Drew, et al.
Published: (2024)
Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning
by: Tuncay, Ludovic, et al.
Published: (2025)
PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective
by: Riou, Alain, et al.
Published: (2023)
S-PRESSO: Ultra Low Bitrate Sound Effect Compression With Diffusion Autoencoders And Offline Quantization
by: Lahrichi, Zineb, et al.
Published: (2026)
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
by: Weck, Benno, et al.
Published: (2024)
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
by: Torres, Bernardo, et al.
Published: (2025)
Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport
by: Torres, Bernardo, et al.
Published: (2023)