Saved in:
| Main Authors: | Torrisi, Antonella M. C., Nolasco, Inês, Sgadò, Paola, Versace, Elisabetta, Benetos, Emmanouil |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.12203 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Acoustic identification of individual animals with hierarchical contrastive learning
by: Nolasco, Ines, et al.
Published: (2024)
by: Nolasco, Ines, et al.
Published: (2024)
SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025)
by: Taheri, Termeh, et al.
Published: (2025)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)
by: Huang, Jiawen, et al.
Published: (2024)
LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024)
by: Papaioannou, Charilaos, et al.
Published: (2024)
Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)
by: Plachouras, Christos, et al.
Published: (2025)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
by: Chang, Sungkyun, et al.
Published: (2025)
Universal Music Representations? Evaluating Foundation Models on World Music Corpora
by: Papaioannou, Charilaos, et al.
Published: (2025)
by: Papaioannou, Charilaos, et al.
Published: (2025)
NSTR: Neural Spectral Transport Representation for Space-Varying Frequency Fields
by: Versace, Plein
Published: (2025)
by: Versace, Plein
Published: (2025)
Domain-Invariant Representation Learning of Bird Sounds
by: Moummad, Ilyass, et al.
Published: (2024)
by: Moummad, Ilyass, et al.
Published: (2024)
GraFPrint: A GNN-Based Approach for Audio Identification
by: Bhattacharjee, Aditya, et al.
Published: (2024)
by: Bhattacharjee, Aditya, et al.
Published: (2024)
SCRAPL: Scattering Transform with Random Paths for Machine Learning
by: Mitcheltree, Christopher, et al.
Published: (2026)
by: Mitcheltree, Christopher, et al.
Published: (2026)
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
by: Chang, Sungkyun, et al.
Published: (2024)
by: Chang, Sungkyun, et al.
Published: (2024)
LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
by: Singh, Shubhr, et al.
Published: (2025)
by: Singh, Shubhr, et al.
Published: (2025)
Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation
by: Bhattacharjee, Aditya, et al.
Published: (2025)
by: Bhattacharjee, Aditya, et al.
Published: (2025)
Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss
by: Huang, Jiawen, et al.
Published: (2025)
by: Huang, Jiawen, et al.
Published: (2025)
A Data-Driven Analysis of Robust Automatic Piano Transcription
by: Edwards, Drew, et al.
Published: (2024)
by: Edwards, Drew, et al.
Published: (2024)
Text2Score: Generating Sheet Music From Textual Prompts
by: Bhandari, Keshav, et al.
Published: (2026)
by: Bhandari, Keshav, et al.
Published: (2026)
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)
by: Postolache, Emilian, et al.
Published: (2024)
Classification of Spontaneous and Scripted Speech for Multilingual Audio
by: Elisha, Shahar, et al.
Published: (2024)
by: Elisha, Shahar, et al.
Published: (2024)
Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning
by: Tuncay, Ludovic, et al.
Published: (2025)
by: Tuncay, Ludovic, et al.
Published: (2025)
Rank-based loss for learning hierarchical representations
by: Nolasco, Ines, et al.
Published: (2021)
by: Nolasco, Ines, et al.
Published: (2021)
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
by: Ma, Yinghao, et al.
Published: (2025)
by: Ma, Yinghao, et al.
Published: (2025)
Twenty-Five Years of MIR Research: Achievements, Practices, Evaluations, and Future Challenges
by: Peeters, Geoffroy, et al.
Published: (2025)
by: Peeters, Geoffroy, et al.
Published: (2025)
Refining music sample identification with a self-supervised graph neural network
by: Bhattacharjee, Aditya, et al.
Published: (2025)
by: Bhattacharjee, Aditya, et al.
Published: (2025)
DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition
by: Jing, Xin, et al.
Published: (2024)
by: Jing, Xin, et al.
Published: (2024)
ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization
by: Steinmetz, Christian J., et al.
Published: (2024)
by: Steinmetz, Christian J., et al.
Published: (2024)
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
by: Weck, Benno, et al.
Published: (2024)
by: Weck, Benno, et al.
Published: (2024)
A Soft Robotic Interface for Chick-Robot Affective Interactions
by: Chen, Jue, et al.
Published: (2026)
by: Chen, Jue, et al.
Published: (2026)
Learning to detect an animal sound from five examples
by: Nolasco, Inês, et al.
Published: (2023)
by: Nolasco, Inês, et al.
Published: (2023)
Quality Audio Prototyping: a prototype system for unified sound retrieval and procedural generation
by: Garcia, Nelly, et al.
Published: (2026)
by: Garcia, Nelly, et al.
Published: (2026)
Non-Verbal Vocalisations and their Challenges: Emotion, Privacy, Sparseness, and Real Life
by: Batliner, Anton, et al.
Published: (2025)
by: Batliner, Anton, et al.
Published: (2025)
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
by: Deng, Zihao, et al.
Published: (2023)
by: Deng, Zihao, et al.
Published: (2023)
WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation
by: Karystinaios, Emmanouil
Published: (2025)
by: Karystinaios, Emmanouil
Published: (2025)
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation
by: Zhou, Ziya, et al.
Published: (2024)
by: Zhou, Ziya, et al.
Published: (2024)
SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)
by: Karystinaios, Emmanouil, et al.
Published: (2024)
Language Models for Music Medicine Generation
by: Nikolakakis, Emmanouil, et al.
Published: (2024)
by: Nikolakakis, Emmanouil, et al.
Published: (2024)
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
by: Zhuo, Le, et al.
Published: (2023)
by: Zhuo, Le, et al.
Published: (2023)
MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
by: Hilaire, Baptiste, et al.
Published: (2025)
by: Hilaire, Baptiste, et al.
Published: (2025)
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection
by: Liang, Jinhua, et al.
Published: (2024)
by: Liang, Jinhua, et al.
Published: (2024)
CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction
by: Ma, Yinghao, et al.
Published: (2026)
by: Ma, Yinghao, et al.
Published: (2026)
Similar Items
-
Acoustic identification of individual animals with hierarchical contrastive learning
by: Nolasco, Ines, et al.
Published: (2024) -
SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025) -
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024) -
LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024) -
Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)