| Main Authors: | Hu, Patricia; Peter, Silvan; Widmer, Gerhard |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.25951 |
Similar Items
Precise and Simple Audio-to-Score Alignment
by: Peter, Silvan, et al.
Published: (2026)
How to Infer Repeat Structures in MIDI Performances
by: Peter, Silvan, et al.
Published: (2025)
Pairing Real-Time Piano Transcription with Symbol-level Tracking for Precise and Robust Score Following
by: Peter, Silvan, et al.
Published: (2025)
Exploring System Adaptations For Minimum Latency Real-Time Piano Transcription
by: Hu, Patricia, et al.
Published: (2025)
TheGlueNote: Learned Representations for Robust and Flexible Note Alignment
by: Peter, Silvan David, et al.
Published: (2024)
Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)
Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis
by: Marták, Lukáš Samuel, et al.
Published: (2025)
Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems
by: Marták, Lukáš Samuel, et al.
Published: (2024)
Towards Musically Informed Evaluation of Piano Transcription Models
by: Hu, Patricia, et al.
Published: (2024)
Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings
by: Chowdhury, Shreyan, et al.
Published: (2024)
Online Symbolic Music Alignment with Offline Reinforcement Learning
by: Peter, Silvan David
Published: (2023)
AnalysisGNN: Unified Music Analysis with Graph Neural Networks
by: Karystinaios, Emmanouil, et al.
Published: (2025)
A Study on the Data Distribution Gap in Music Emotion Recognition
by: Ching, Joann, et al.
Published: (2025)
MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
by: Hilaire, Baptiste, et al.
Published: (2025)
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
by: Guo, Wenxiang, et al.
Published: (2025)
SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models
by: Morocutti, Tobias, et al.
Published: (2025)
MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models
by: Ouyang, Zhihao, et al.
Published: (2025)
Tadabur: A Large-Scale Quran Audio Dataset
by: Alherran, Faisal
Published: (2026)
GraphMuse: A Library for Symbolic Music Graph Processing
by: Karystinaios, Emmanouil, et al.
Published: (2024)
How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation
by: Kumar, Deepak, et al.
Published: (2026)
Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams
by: He, Xiluo, et al.
Published: (2025)
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
by: Primus, Paul, et al.
Published: (2025)
Are Inherently Interpretable Models More Robust? A Study In Music Emotion Recognition
by: Hoedt, Katharina, et al.
Published: (2025)
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)
Beat this! Accurate beat tracking without DBN postprocessing
by: Foscarin, Francesco, et al.
Published: (2024)
MusicScore: A Dataset for Music Score Modeling and Generation
by: Lin, Yuheng, et al.
Published: (2024)
Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces
by: Bjare, Mathias Rose, et al.
Published: (2025)
MAGE: Modality-Agnostic Music Generation and Editing
by: Saleem, Muhammad Usama, et al.
Published: (2026)
Perceptually Aligning Representations of Music via Noise-Augmented Autoencoders
by: Bjare, Mathias Rose, et al.
Published: (2025)
Perception-Inspired Graph Convolution for Music Understanding Tasks
by: Karystinaios, Emmanouil, et al.
Published: (2024)
JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata
by: Roy, Abhinaba, et al.
Published: (2025)
CASPER: A Large Scale Spontaneous Speech Dataset
by: Xiao, Cihan, et al.
Published: (2025)
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)
Task-Agnostic Structured Pruning of Speech Representation Models
by: Wang, Haoyu, et al.
Published: (2023)
Multi-Stage Music Source Restoration with BandSplit-RoFormer Separation and HiFi++ GAN
by: Morocutti, Tobias, et al.
Published: (2026)
Creating a Good Teacher for Knowledge Distillation in Acoustic Scene Classification
by: Morocutti, Tobias, et al.
Published: (2025)
Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition
by: Li, Zeyu, et al.
Published: (2024)