:: Library Catalog

Bibliographic Details
Main Authors:	Torrisi, Antonella M. C., Nolasco, Inês, Sgadò, Paola, Versace, Elisabetta, Benetos, Emmanouil
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2601.12203

Similar Items

Acoustic identification of individual animals with hierarchical contrastive learning
by: Nolasco, Ines, et al.
Published: (2024)

SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025)

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)

LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024)

Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)

RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)

Universal Music Representations? Evaluating Foundation Models on World Music Corpora
by: Papaioannou, Charilaos, et al.
Published: (2025)

NSTR: Neural Spectral Transport Representation for Space-Varying Frequency Fields
by: Versace, Plein
Published: (2025)

Domain-Invariant Representation Learning of Bird Sounds
by: Moummad, Ilyass, et al.
Published: (2024)

GraFPrint: A GNN-Based Approach for Audio Identification
by: Bhattacharjee, Aditya, et al.
Published: (2024)

SCRAPL: Scattering Transform with Random Paths for Machine Learning
by: Mitcheltree, Christopher, et al.
Published: (2026)

YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
by: Chang, Sungkyun, et al.
Published: (2024)

LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
by: Singh, Shubhr, et al.
Published: (2025)

Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation
by: Bhattacharjee, Aditya, et al.
Published: (2025)

Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss
by: Huang, Jiawen, et al.
Published: (2025)

A Data-Driven Analysis of Robust Automatic Piano Transcription
by: Edwards, Drew, et al.
Published: (2024)

Text2Score: Generating Sheet Music From Textual Prompts
by: Bhandari, Keshav, et al.
Published: (2026)

Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)

Classification of Spontaneous and Scripted Speech for Multilingual Audio
by: Elisha, Shahar, et al.
Published: (2024)

Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning
by: Tuncay, Ludovic, et al.
Published: (2025)

Rank-based loss for learning hierarchical representations
by: Nolasco, Ines, et al.
Published: (2021)

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
by: Ma, Yinghao, et al.
Published: (2025)

Twenty-Five Years of MIR Research: Achievements, Practices, Evaluations, and Future Challenges
by: Peeters, Geoffroy, et al.
Published: (2025)

Refining music sample identification with a self-supervised graph neural network
by: Bhattacharjee, Aditya, et al.
Published: (2025)

DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition
by: Jing, Xin, et al.
Published: (2024)

ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization
by: Steinmetz, Christian J., et al.
Published: (2024)

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
by: Weck, Benno, et al.
Published: (2024)

A Soft Robotic Interface for Chick-Robot Affective Interactions
by: Chen, Jue, et al.
Published: (2026)

Learning to detect an animal sound from five examples
by: Nolasco, Inês, et al.
Published: (2023)

Quality Audio Prototyping: a prototype system for unified sound retrieval and procedural generation
by: Garcia, Nelly, et al.
Published: (2026)

Non-Verbal Vocalisations and their Challenges: Emotion, Privacy, Sparseness, and Real Life
by: Batliner, Anton, et al.
Published: (2025)

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
by: Deng, Zihao, et al.
Published: (2023)

WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation
by: Karystinaios, Emmanouil
Published: (2025)

Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation
by: Zhou, Ziya, et al.
Published: (2024)

SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)

Language Models for Music Medicine Generation
by: Nikolakakis, Emmanouil, et al.
Published: (2024)

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
by: Zhuo, Le, et al.
Published: (2023)

MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
by: Hilaire, Baptiste, et al.
Published: (2025)

Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection
by: Liang, Jinhua, et al.
Published: (2024)

CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction
by: Ma, Yinghao, et al.
Published: (2026)