| Main Authors: | Choudhary, Yash, Rao, Preeti, Bhattacharyya, Pushpak |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.06259 |
Similar Items
Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction
by: Choudhary, Yash, et al.
Published: (2025)
Guitar Chord Diagram Suggestion for Western Popular Music
by: d'Hooge, Alexandre, et al.
Published: (2024)
MusicLIME: Explainable Multimodal Music Understanding
by: Sotirou, Theodoros, et al.
Published: (2024)
GaMMA: Towards Joint Global-Temporal Music Understanding in Large Multimodal Models
by: You, Zuyao, et al.
Published: (2026)
Survey on the Evaluation of Generative Models in Music
by: Lerch, Alexander, et al.
Published: (2025)
Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
by: Yi, Yungang, et al.
Published: (2026)
Melodic and Metrical Elements of Expressiveness in Hindustani Vocal Music
by: Bhake, Yash, et al.
Published: (2025)
CSyMR: Benchmarking Compositional Music Information Retrieval in Symbolic Music Reasoning
by: Wang, Boyang, et al.
Published: (2025)
MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts
by: Lou, Yuxuan, et al.
Published: (2026)
MuseCPBench: an Empirical Study of Music Editing Methods through Music Context Preservation
by: Vishe, Yash, et al.
Published: (2025)
Multimodal Audio-based Disease Prediction with Transformer-based Hierarchical Fusion Network
by: Cai, Jinjin, et al.
Published: (2024)
ConceptCaps: a Distilled Concept Dataset for Interpretability in Music Models
by: Sienkiewicz, Bruno, et al.
Published: (2026)
Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
by: Novack, Zachary, et al.
Published: (2026)
Memo2496: Expert-Annotated Dataset and Dual-View Adaptive Framework for Music Emotion Recognition
by: Li, Qilin, et al.
Published: (2025)
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
by: Zhang, Yixiao, et al.
Published: (2024)
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
by: Bradshaw, Louis, et al.
Published: (2025)
LAPS-Diff: A Diffusion-Based Framework for Singing Voice Synthesis With Language Aware Prosody-Style Guided Learning
by: Dhar, Sandipan, et al.
Published: (2025)
Timing Matters: Enhancing User Experience through Temporal Prediction in Smart Homes
by: Ganatra, Shrey, et al.
Published: (2024)
Modality-Invariant Bidirectional Temporal Representation Distillation Network for Missing Multimodal Sentiment Analysis
by: Wang, Xincheng, et al.
Published: (2025)
Modality-Specific Speech Enhancement and Noise-Adaptive Fusion for Acoustic and Body-Conduction Microphone Framework
by: Kim, Yunsik, et al.
Published: (2025)
CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction
by: Ma, Yinghao, et al.
Published: (2026)
Myna: Masking-Based Contrastive Learning of Musical Representations
by: Yonay, Ori, et al.
Published: (2025)
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
by: Li, Shuyu, et al.
Published: (2025)
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
by: Wu, Junda, et al.
Published: (2024)
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models
by: Mehta, Atharva, et al.
Published: (2025)
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
by: Novack, Zachary, et al.
Published: (2024)
Fusion Segment Transformer: Bi-Directional Attention Guided Fusion Network for AI-Generated Music Detection
by: Kim, Yumin, et al.
Published: (2026)
SAGE-Music: Low-Latency Symbolic Music Generation via Attribute-Specialized Key-Value Head Sharing
by: Tan, Jiaye, et al.
Published: (2025)
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
by: Agarwal, Manvi, et al.
Published: (2025)
WhisQ: Cross-Modal Representation Learning for Text-to-Music MOS Prediction
by: Emon, Jakaria Islam, et al.
Published: (2025)
Fusing Memory and Attention: A study on LSTM, Transformer and Hybrid Architectures for Symbolic Music Generation
by: Ghoshal, Soudeep, et al.
Published: (2026)
Do Music Generation Models Encode Music Theory?
by: Wei, Megan, et al.
Published: (2024)
FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation
by: Fan, Pingyi, et al.
Published: (2025)
An Independence-promoting Loss for Music Generation with Language Models
by: Lemercier, Jean-Marie, et al.
Published: (2024)
Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation
by: Prokopiou, Ioannis, et al.
Published: (2026)
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
by: Pegg, Samuel, et al.
Published: (2024)
Music Source Restoration
by: Zang, Yongyi, et al.
Published: (2025)
AMB-DSGDN: Adaptive Modality-Balanced Dynamic Semantic Graph Differential Network for Multimodal Emotion Recognition
by: Wang, Yunsheng, et al.
Published: (2026)
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
by: Huang, Yu-Fen, et al.
Published: (2024)
CloserMusicDB: A Modern Multipurpose Dataset of High Quality Music
by: Piekarzewicz, Aleksandra, et al.
Published: (2024)