| Main Authors: | Wang, Ting-Kang; Peng, Yueh-Po; Su, Li; Cheung, Vincent K. M. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.23759 |
Similar Items
Is Transfer Learning Necessary for Violin Transcription?
by: Peng, Yueh-Po, et al.
Published: (2025)
ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning
by: Kim, Daewoong, et al.
Published: (2024)
Music Transcription with (Almost) No Supervision
by: Shin, Saebyeol, et al.
Published: (2026)
Machine Learning Techniques in Automatic Music Transcription: A Systematic Survey
by: Jamshidi, Fatemeh, et al.
Published: (2024)
Count The Notes: Histogram-Based Supervision for Automatic Music Transcription
by: Yaffe, Jonathan, et al.
Published: (2025)
Towards Robust Transcription: Exploring Noise Injection Strategies for Training Data Augmentation
by: Kim, Yonghyun, et al.
Published: (2024)
Scaling to Multimodal and Multichannel Heart Sound Classification with Synthetic and Augmented Biosignals
by: Marocchi, Milan, et al.
Published: (2025)
A Data-Driven Analysis of Robust Automatic Piano Transcription
by: Edwards, Drew, et al.
Published: (2024)
Lyrics Transcription for Humans: A Readability-Aware Benchmark
by: Cífka, Ondřej, et al.
Published: (2024)
Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation
by: McShannon, David, et al.
Published: (2026)
Adversarial Data Augmentation for Robust Speaker Verification
by: Zhou, Zhenyu, et al.
Published: (2024)
Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation
by: Xu, Ji, et al.
Published: (2023)
Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis
by: Marták, Lukáš Samuel, et al.
Published: (2025)
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
by: Chang, Sungkyun, et al.
Published: (2024)
SAMUeL: Efficient Vocal-Conditioned Music Generation via Soft Alignment Attention and Latent Diffusion
by: Cheung, Hei Shing, et al.
Published: (2025)
A Study on Synthesizing Expressive Violin Performances: Approaches and Comparisons
by: Hung, Tzu-Yun, et al.
Published: (2024)
Heterogeneity-Aware Dataset Scheduling for Efficient Audio Large Language Model Training
by: Wu, Yanru, et al.
Published: (2026)
Music Genre Classification Using Machine Learning Techniques
by: Mishra, Alokit, et al.
Published: (2025)
High Resolution Guitar Transcription via Domain Adaptation
by: Riley, Xavier, et al.
Published: (2024)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
Targeted Augmented Data for Audio Deepfake Detection
by: Astrid, Marcella, et al.
Published: (2024)
Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems
by: Marták, Lukáš Samuel, et al.
Published: (2024)
HuLA: Prosody-Aware Anti-Spoofing with Multi-Task Learning for Expressive and Emotional Synthetic Speech
by: Mahapatra, Aurosweta, et al.
Published: (2025)
Meta-Learning-Based Delayless Subband Adaptive Filter using Complex Self-Attention for Active Noise Control
by: Feng, Pengxing, et al.
Published: (2024)
Conditional Generative Data Augmentation for Clinical Audio Datasets
by: Seibold, Matthias, et al.
Published: (2022)
Noise-to-Notes: Diffusion-based Generation and Refinement for Automatic Drum Transcription
by: Yeung, Michael, et al.
Published: (2025)
Exploring System Adaptations For Minimum Latency Real-Time Piano Transcription
by: Hu, Patricia, et al.
Published: (2025)
Moonshine: Speech Recognition for Live Transcription and Voice Commands
by: Jeffries, Nat, et al.
Published: (2024)
Early Attentive Sparsification Accelerates Neural Speech Transcription
by: Xu, Zifei, et al.
Published: (2025)
Beyond Transcription: Mechanistic Interpretability in ASR
by: Glazer, Neta, et al.
Published: (2025)
Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription
by: Yan, Yujia, et al.
Published: (2024)
Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription
by: Cwitkowitz, Frank, et al.
Published: (2023)
Towards Efficient and Real-Time Piano Transcription Using Neural Autoregressive Models
by: Kwon, Taegyun, et al.
Published: (2024)
SALSA-V: Shortcut-Augmented Long-form Synchronized Audio from Videos
by: Dellali, Amir, et al.
Published: (2025)
Piano Transcription by Hierarchical Language Modeling with Pretrained Roll-based Encoders
by: Li, Dichucheng, et al.
Published: (2025)
AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
by: Komiya, Kazuma, et al.
Published: (2024)
Semantic-Aware Confidence Calibration for Automated Audio Captioning
by: Dunker, Lucas, et al.
Published: (2025)
Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)
CAARMA: Class Augmentation with Adversarial Mixup Regularization
by: Baali, Massa, et al.
Published: (2025)
Scaling Ambiguity: Augmenting Human Annotation in Speech Emotion Recognition with Audio-Language Models
by: Zhang, Wenda, et al.
Published: (2026)