| Main Authors: | Peter, Silvan, Hu, Patricia, Widmer, Gerhard |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.20014 |
Similar Items
Pairing Real-Time Piano Transcription with Symbol-level Tracking for Precise and Robust Score Following
by: Peter, Silvan, et al.
Published: (2025)
Score-Agnostic Structure Analysis in Large-Scale Performance Datasets
by: Hu, Patricia, et al.
Published: (2026)
How to Infer Repeat Structures in MIDI Performances
by: Peter, Silvan, et al.
Published: (2025)
TheGlueNote: Learned Representations for Robust and Flexible Note Alignment
by: Peter, Silvan David, et al.
Published: (2024)
Exploring System Adaptations For Minimum Latency Real-Time Piano Transcription
by: Hu, Patricia, et al.
Published: (2025)
Online Symbolic Music Alignment with Offline Reinforcement Learning
by: Peter, Silvan David
Published: (2023)
Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
by: Primus, Paul, et al.
Published: (2025)
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis
by: Marták, Lukáš Samuel, et al.
Published: (2025)
Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems
by: Marták, Lukáš Samuel, et al.
Published: (2024)
Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)
Estimating Musical Surprisal in Audio
by: Bjare, Mathias Rose, et al.
Published: (2025)
Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces
by: Bjare, Mathias Rose, et al.
Published: (2025)
Just Label the Repeats for In-The-Wild Audio-to-Score Alignment
by: Bukey, Irmak, et al.
Published: (2024)
Towards Musically Informed Evaluation of Piano Transcription Models
by: Hu, Patricia, et al.
Published: (2024)
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)
Effective Pre-Training of Audio Transformers for Sound Event Detection
by: Schmid, Florian, et al.
Published: (2024)
A Study on the Data Distribution Gap in Music Emotion Recognition
by: Ching, Joann, et al.
Published: (2025)
Audio-Visual Separation with Hierarchical Fusion and Representation Alignment
by: Hu, Han, et al.
Published: (2025)
On Temporal Guidance and Iterative Refinement in Audio Source Separation
by: Morocutti, Tobias, et al.
Published: (2025)
MoEScore: Mixture-of-Experts-Based Text-Audio Relevance Score Prediction for Text-to-Audio System Evaluation
by: Sun, Bochao, et al.
Published: (2026)
pyAMPACT: A Score-Audio Alignment Toolkit for Performance Data Estimation and Multi-modal Processing
by: Devaney, Johanna, et al.
Published: (2024)
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions
by: Zhu, Xinfa, et al.
Published: (2025)
MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
by: Hilaire, Baptiste, et al.
Published: (2025)
Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings
by: Chowdhury, Shreyan, et al.
Published: (2024)
StyleBreak: Revealing Alignment Vulnerabilities in Large Audio-Language Models via Style-Aware Audio Jailbreak
by: Li, Hongyi, et al.
Published: (2025)
SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)
AIBA: Attention-based Instrument Band Alignment for Text-to-Audio Diffusion
by: Koh, Junyoung, et al.
Published: (2025)
GraphMuse: A Library for Symbolic Music Graph Processing
by: Karystinaios, Emmanouil, et al.
Published: (2024)
How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation
by: Kumar, Deepak, et al.
Published: (2026)
AnalysisGNN: Unified Music Analysis with Graph Neural Networks
by: Karystinaios, Emmanouil, et al.
Published: (2025)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
MuseAgent-1: Interactive Grounded Multimodal Understanding of Music Scores and Performance Audio
by: Zhao, Qihao, et al.
Published: (2026)
Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs
by: Roh, Jaechul, et al.
Published: (2026)
Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment
by: Liu, Chen, et al.
Published: (2025)
Assessing the Alignment of Audio Representations with Timbre Similarity Ratings
by: Tian, Haokun, et al.
Published: (2025)
Are Inherently Interpretable Models More Robust? A Study In Music Emotion Recognition
by: Hoedt, Katharina, et al.
Published: (2025)
Beat this! Accurate beat tracking without DBN postprocessing
by: Foscarin, Francesco, et al.
Published: (2024)