:: Library Catalog

Bibliographic Details
Main Authors:	Ma, Yinghao; Øland, Anders; Ragni, Anton; Del Sette, Bleiz MacSen; Saitis, Charalampos; Donahue, Chris; Lin, Chenghua; Plachouras, Christos; Benetos, Emmanouil; Shatri, Elona; Morreale, Fabio; Zhang, Ge; Fazekas, György; Xia, Gus; Zhang, Huan; Manco, Ilaria; Huang, Jiawen; Guinot, Julien; Lin, Liwei; Marinelli, Luca; Lam, Max W. Y.; Sharma, Megha; Kong, Qiuqiang; Dannenberg, Roger B.; Yuan, Ruibin; Wu, Shangda; Wu, Shih-Lun; Dai, Shuqi; Lei, Shun; Kang, Shiyin; Dixon, Simon; Chen, Wenhu; Huang, Wenhao; Du, Xingjian; Qu, Xingwei; Tan, Xu; Li, Yizhi; Tian, Zeyue; Wu, Zhiyong; Wu, Zhizheng; Ma, Ziyang; Wang, Ziyu
Format:	Preprint
Published:	2024
Subjects:	Sound; Artificial Intelligence; Computation and Language; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2408.14340

Similar Items

Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation
by: Shatri, Elona, et al.
Published: (2024)

Low-Data Classification of Historical Music Manuscripts: A Few-Shot Learning Approach
by: Shatri, Elona, et al.
Published: (2024)

Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN
by: Shatri, Elona, et al.
Published: (2024)

Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
by: Plachouras, Christos, et al.
Published: (2025)

Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)

Proceedings of the 6th International Workshop on Reading Music Systems
by: Calvo-Zaragoza, Jorge, et al.
Published: (2024)

Automatic Melody Reduction via Shortest Path Finding
by: Wang, Ziyu, et al.
Published: (2025)

Composer Style-specific Symbolic Music Generation Using Vector Quantized Discrete Diffusion Models
by: Zhang, Jincheng, et al.
Published: (2023)

Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation
by: Zhang, Jincheng, et al.
Published: (2025)

Audio synthesizer inversion in symmetric parameter spaces with approximately equivariant flow matching
by: Hayes, Ben, et al.
Published: (2025)

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
by: Weck, Benno, et al.
Published: (2024)

ChatMusician: Understanding and Generating Music Intrinsically with LLM
by: Yuan, Ruibin, et al.
Published: (2024)

Voices of Civilizations: A Multilingual QA Benchmark for Global Music Understanding
by: Wu, Shangda, et al.
Published: (2026)

PUBLIC AND POSTCOLONIAL PRACTICES IN LATIN AMERICAN ARCHAEOLOGY: ENGAGING WITH NON-DESCENDANT COMMUNITIES IN NORTHERN BELIZE
by: Maxine Oland
Published: (2012)

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
by: Li, Yizhi, et al.
Published: (2023)

GD-Retriever: Controllable Generative Text-Music Retrieval with Diffusion Models
by: Guinot, Julien, et al.
Published: (2025)

Leave-One-EquiVariant: Alleviating invariance-related information loss in contrastive music representations
by: Guinot, Julien, et al.
Published: (2024)

Semi-Supervised Contrastive Learning of Musical Representations
by: Guinot, Julien, et al.
Published: (2024)

Exploring Tokenization Methods for Multitrack Sheet Music Generation
by: Wang, Yashan, et al.
Published: (2024)

PSR J0952-0607: Probing the Stiffest Equations of State and r-Mode Suppression Mechanisms
by: Wu, Zeyue, et al.
Published: (2025)

SLAP: Siamese Language-Audio Pretraining Without Negative Samples for Music Understanding
by: Guinot, Julien, et al.
Published: (2025)

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
by: Wu, Shangda, et al.
Published: (2025)

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
by: Xie, Zeyu, et al.
Published: (2024)

AudioTime: A Temporally-aligned Audio-text Benchmark Dataset
by: Xie, Zeyu, et al.
Published: (2024)

A Holistic Evaluation of Piano Sound Quality
by: Zhou, Monan, et al.
Published: (2023)

MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing
by: Wu, Shangda, et al.
Published: (2024)

OPEN SCIENCE POLICIES SEEN FROM THE PERSPECTIVE OF RESEARCH COMMUNITIES: THE CASE OF PERU
by: Manco, Alejandra, et al.
Published: (2023)

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
by: Deng, Zihao, et al.
Published: (2023)

Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs
by: Yifeng, Peng, et al.
Published: (2025)

RESILIENCE IN THE CONTEXT OF A DISASTER: FINANCIAL STRATEGIES FOR COPING WITH DISASTERS
by: Pojani, Elona
Published: (2025)

Sozial-ökologische Krise und kollektives Landeigentum
by: Dannenberg, Janina
Published: (2024)

Inference-time Scaling for Diffusion-based Audio Super-resolution
by: Jin, Yizhu, et al.
Published: (2025)

Apreciación lectora de La Moschea de José de Villaviciosa
by: Margherita Morreale
Published: (2005)

SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025)

VAInpaint: Zero-Shot Video-Audio inpainting framework with LLMs-driven Module
by: Wu, Kam Man, et al.
Published: (2025)

FlexiVoice: Enabling Flexible Style Control in Zero-Shot TTS with Natural Language Instructions
by: Chen, Dekun, et al.
Published: (2026)

MusiScene: Leveraging MU-LLaMA for Scene Imagination and Enhanced Video Background Music Generation
by: Izzati, Fathinah, et al.
Published: (2025)

Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints
by: Wu, Yuxuan, et al.
Published: (2024)

Improving Power Generation in Rigid‐Wing Groundgen Airborne Wind Energy Systems Using Feedback Control—A Parametric Study
by: Duc H. Nguyen, et al.
Published: (2026)

Beyond Language Models: Byte Models are Digital World Simulators
by: Wu, Shangda, et al.
Published: (2024)