| Main Authors: | Zhang, Huan, Chowdhury, Shreyan, Cancino-Chacón, Carlos Eduardo, Liang, Jinhua, Dixon, Simon, Widmer, Gerhard |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.14850 |
Similar Items
Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings
by: Chowdhury, Shreyan, et al.
Published: (2024)
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano
by: Zhang, Huan, et al.
Published: (2024)
Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance
by: Peter, Silvan David, et al.
Published: (2023)
Towards Musically Informed Evaluation of Piano Transcription Models
by: Hu, Patricia, et al.
Published: (2024)
RenderBox: Expressive Performance Rendering with Text Control
by: Zhang, Huan, et al.
Published: (2025)
Hierarchical Symbolic Pop Music Generation with Graph Neural Networks
by: Lim, Wen Qing, et al.
Published: (2024)
LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment
by: Zhang, Huan, et al.
Published: (2024)
From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems
by: Zhang, Huan, et al.
Published: (2025)
How to Infer Repeat Structures in MIDI Performances
by: Peter, Silvan, et al.
Published: (2025)
How does the teacher rate? Observations from the NeuroPiano dataset
by: Zhang, Huan, et al.
Published: (2024)
Improving Query-by-Vocal Imitation with Contrastive Learning and Audio Pretraining
by: Greif, Jonathan, et al.
Published: (2024)
Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)
Emotion-Aware Speech Generation with Character-Specific Voices for Comics
by: Qian, Zhiwen, et al.
Published: (2025)
Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline
by: Riley, Xavier, et al.
Published: (2024)
Enhanced Automatic Drum Transcription via Drum Stem Source Separation
by: Riley, Xavier, et al.
Published: (2025)
Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces
by: Bjare, Mathias Rose, et al.
Published: (2025)
WavCraft: Audio Editing and Generation with Large Language Models
by: Liang, Jinhua, et al.
Published: (2024)
GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model
by: Riley, Xavier, et al.
Published: (2024)
Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes
by: Guo, Zixun, et al.
Published: (2025)
SMUG-Explain: A Framework for Symbolic Music Graph Explanations
by: Karystinaios, Emmanouil, et al.
Published: (2024)
Pairing Real-Time Piano Transcription with Symbol-level Tracking for Precise and Robust Score Following
by: Peter, Silvan, et al.
Published: (2025)
TheGlueNote: Learned Representations for Robust and Flexible Note Alignment
by: Peter, Silvan David, et al.
Published: (2024)
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models
by: Morocutti, Tobias, et al.
Published: (2025)
Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings
by: Hisariya, Tanisha, et al.
Published: (2024)
Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities
by: Liang, Jinhua, et al.
Published: (2023)
Multi-Iteration Multi-Stage Fine-Tuning of Transformers for Sound Event Detection with Heterogeneous Datasets
by: Schmid, Florian, et al.
Published: (2024)
Language Models for Music Medicine Generation
by: Nikolakakis, Emmanouil, et al.
Published: (2024)
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
Are Inherently Interpretable Models More Robust? A Study In Music Emotion Recognition
by: Hoedt, Katharina, et al.
Published: (2025)
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection
by: Liang, Jinhua, et al.
Published: (2024)
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
High Resolution Guitar Transcription via Domain Adaptation
by: Riley, Xavier, et al.
Published: (2024)
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
by: Primus, Paul, et al.
Published: (2024)
Beat this! Accurate beat tracking without DBN postprocessing
by: Foscarin, Francesco, et al.
Published: (2024)
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
by: Primus, Paul, et al.
Published: (2025)
Effective Pre-Training of Audio Transformers for Sound Event Detection
by: Schmid, Florian, et al.
Published: (2024)
AudioMorphix: Training-free audio editing with diffusion probabilistic models
by: Liang, Jinhua, et al.
Published: (2025)
Toward Natural Emotional Text-To-Speech System with Fine-Grained Non-Verbal Expression Control
by: Zhou, Wangzixi, et al.
Published: (2026)
Controlling Surprisal in Music Generation via Information Content Curve Matching
by: Bjare, Mathias Rose, et al.
Published: (2024)