| Main Authors: | Li, Jiajun, Xu, Tianze, Chen, Xuesong, Yao, Xinrui, Liu, Shuchang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.02801 |
Similar Items
M$^{2}$UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models
by: Liu, Shansong, et al.
Published: (2023)
MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models
by: Liu, Shansong, et al.
Published: (2024)
Large-Scale Training Data Attribution for Music Generative Models via Unlearning
by: Choi, Woosung, et al.
Published: (2025)
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
by: Chen, Junyu, et al.
Published: (2023)
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
by: Bai, Ye, et al.
Published: (2024)
Temporal Adaptation of Pre-trained Foundation Models for Music Structure Analysis
by: Zhang, Yixiao, et al.
Published: (2025)
PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training
by: Liang, Xiao, et al.
Published: (2024)
Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model
by: Karchkhadze, Tornike, et al.
Published: (2024)
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
by: Wang, Yashan, et al.
Published: (2025)
From Generality to Mastery: Composer-Style Symbolic Music Generation via Large-Scale Pre-training
by: Yao, Mingyang, et al.
Published: (2025)
Melodia: Training-Free Music Editing Guided by Attention Probing in Diffusion Models
by: Yang, Yi, et al.
Published: (2025)
OMAR-RQ: Open Music Audio Representation Model Trained with Multi-Feature Masked Token Prediction
by: Alonso-Jiménez, Pablo, et al.
Published: (2025)
Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models
by: Karchkhadze, Tornike, et al.
Published: (2024)
MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)
A Lightweight Slot-Attention Framework for Multi-Instrument Multi-Pitch Estimation
by: Taenzer, Michael
Published: (2026)
MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners
by: Tsai, Fang-Duo, et al.
Published: (2025)
MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision
by: Chen, Jiatao, et al.
Published: (2024)
Language Models for Music Medicine Generation
by: Nikolakakis, Emmanouil, et al.
Published: (2024)
Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
by: Zhu, Xiaoxu, et al.
Published: (2025)
ACE-Step: A Step Towards Music Generation Foundation Model
by: Gong, Junmin, et al.
Published: (2025)
FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
by: Comanducci, Luca, et al.
Published: (2024)
Watermarking Training Data of Music Generation Models
by: Epple, Pascal, et al.
Published: (2024)
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
by: Zhao, Hang, et al.
Published: (2024)
A Diffusion-Based Generative Equalizer for Music Restoration
by: Moliner, Eloi, et al.
Published: (2024)
Large Language Models: From Notes to Musical Form
by: Atassi, Lilac
Published: (2024)
Multi-modal Speech Enhancement with Limited Electromyography Channels
by: Feng, Fuyuan, et al.
Published: (2025)
MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss
by: Shu, Yangyang, et al.
Published: (2024)
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
by: Chen, Li-Wei, et al.
Published: (2024)
Training a Perceptual Model for Evaluating Auditory Similarity in Music Adversarial Attack
by: Liu, Yuxuan, et al.
Published: (2025)
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch
by: Song, Xingchen, et al.
Published: (2024)
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning
by: Tsai, Fang-Duo, et al.
Published: (2024)
Adapter-Based Multi-Agent AVSR Extension for Pre-Trained ASR Models
by: Simic, Christopher, et al.
Published: (2025)
Improving Controllability and Editability for Pretrained Text-to-Music Generation Models
by: Zhang, Yixiao
Published: (2024)
The Arrow of Time in Music -- Revisiting the Temporal Structure of Music with Distinguishability and Unique Orientability as the Anchor Point
by: Xu, Qi
Published: (2023)
Bridging the Gap Between Semantic and User Preference Spaces for Multi-modal Music Representation Learning
by: Pan, Xiaofeng, et al.
Published: (2025)
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
by: Mu, Bingshen, et al.
Published: (2024)
Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale Adversarial Pre-training
by: Zhao, Zijian
Published: (2024)
MusicAOG: an Energy-Based Model for Learning and Sampling a Hierarchical Representation of Symbolic Music
by: Qian, Yikai, et al.
Published: (2024)
Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
by: Shi, Jiatong, et al.
Published: (2024)