:: Library Catalog

Bibliographic Details
Main Authors:	Hu, Patricia; Peter, Silvan; Widmer, Gerhard
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2605.25951

Similar Items

Precise and Simple Audio-to-Score Alignment
by: Peter, Silvan, et al.
Published: (2026)

How to Infer Repeat Structures in MIDI Performances
by: Peter, Silvan, et al.
Published: (2025)

Pairing Real-Time Piano Transcription with Symbol-level Tracking for Precise and Robust Score Following
by: Peter, Silvan, et al.
Published: (2025)

Exploring System Adaptations For Minimum Latency Real-Time Piano Transcription
by: Hu, Patricia, et al.
Published: (2025)

TheGlueNote: Learned Representations for Robust and Flexible Note Alignment
by: Peter, Silvan David, et al.
Published: (2024)

Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)

Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis
by: Marták, Lukáš Samuel, et al.
Published: (2025)

Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems
by: Marták, Lukáš Samuel, et al.
Published: (2024)

Towards Musically Informed Evaluation of Piano Transcription Models
by: Hu, Patricia, et al.
Published: (2024)

Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings
by: Chowdhury, Shreyan, et al.
Published: (2024)

Online Symbolic Music Alignment with Offline Reinforcement Learning
by: Peter, Silvan David
Published: (2023)

AnalysisGNN: Unified Music Analysis with Graph Neural Networks
by: Karystinaios, Emmanouil, et al.
Published: (2025)

A Study on the Data Distribution Gap in Music Emotion Recognition
by: Ching, Joann, et al.
Published: (2025)

MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
by: Hilaire, Baptiste, et al.
Published: (2025)

Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
by: Guo, Wenxiang, et al.
Published: (2025)

SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)

Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models
by: Morocutti, Tobias, et al.
Published: (2025)

MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models
by: Ouyang, Zhihao, et al.
Published: (2025)

Tadabur: A Large-Scale Quran Audio Dataset
by: Alherran, Faisal
Published: (2026)

GraphMuse: A Library for Symbolic Music Graph Processing
by: Karystinaios, Emmanouil, et al.
Published: (2024)

How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation
by: Kumar, Deepak, et al.
Published: (2026)

Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams
by: He, Xiluo, et al.
Published: (2025)

TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
by: Primus, Paul, et al.
Published: (2025)

Are Inherently Interpretable Models More Robust? A Study In Music Emotion Recognition
by: Hoedt, Katharina, et al.
Published: (2025)

Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)

Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)

Beat this! Accurate beat tracking without DBN postprocessing
by: Foscarin, Francesco, et al.
Published: (2024)

MusicScore: A Dataset for Music Score Modeling and Generation
by: Lin, Yuheng, et al.
Published: (2024)

Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces
by: Bjare, Mathias Rose, et al.
Published: (2025)

MAGE: Modality-Agnostic Music Generation and Editing
by: Saleem, Muhammad Usama, et al.
Published: (2026)

Perceptually Aligning Representations of Music via Noise-Augmented Autoencoders
by: Bjare, Mathias Rose, et al.
Published: (2025)

Perception-Inspired Graph Convolution for Music Understanding Tasks
by: Karystinaios, Emmanouil, et al.
Published: (2024)

JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata
by: Roy, Abhinaba, et al.
Published: (2025)

CASPER: A Large Scale Spontaneous Speech Dataset
by: Xiao, Cihan, et al.
Published: (2025)

Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)

Task-Agnostic Structured Pruning of Speech Representation Models
by: Wang, Haoyu, et al.
Published: (2023)

Multi-Stage Music Source Restoration with BandSplit-RoFormer Separation and HiFi++ GAN
by: Morocutti, Tobias, et al.
Published: (2026)

Creating a Good Teacher for Knowledge Distillation in Acoustic Scene Classification
by: Morocutti, Tobias, et al.
Published: (2025)

Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition
by: Li, Zeyu, et al.
Published: (2024)