| Main Authors: | Batra, Arnesh, Sharma, Dev, Thukral, Krish, Bhatia, Ruhani, Batra, Naman, Gautam, Aditya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.00621 |
Similar Items
Missing Melodies: AI Music Generation and its "Nearly" Complete Omission of the Global South
by: Mehta, Atharva, et al.
Published: (2024)
MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection
by: Lu, Tongyu, et al.
Published: (2025)
Exploring Machine Learning and Language Models for Multimodal Depression Detection
by: Hong, Javier Si Zhao, et al.
Published: (2025)
Melody-Guided Music Generation
by: Wei, Shaopeng, et al.
Published: (2024)
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition
by: Ding, Shuangrui, et al.
Published: (2024)
MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
by: Liao, Callie C., et al.
Published: (2025)
Story2MIDI: Emotionally Aligned Music Generation from Text
by: Shokri, Mohammad, et al.
Published: (2025)
EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation
by: Sajeer, Sahal, et al.
Published: (2026)
Generative Artificial Intelligence, Musical Heritage and the Construction of Peace Narratives: A Case Study in Mali
by: Coulibaly, Nouhoum, et al.
Published: (2026)
Cross-Modal Learning for Music-to-Music-Video Description Generation
by: Mao, Zhuoyuan, et al.
Published: (2025)
Linear Complexity Self-Supervised Learning for Music Understanding with Random Quantizer
by: Vavaroutsos, Petros, et al.
Published: (2026)
MindMelody: A Closed-Loop EEG-Driven System for Personalized Music Intervention
by: Zhang, Yimeng, et al.
Published: (2026)
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models
by: Mehta, Atharva, et al.
Published: (2025)
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance
by: Zheng, Junjie, et al.
Published: (2025)
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
by: Zhang, Shaolei, et al.
Published: (2024)
Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation
by: Wang, Peidong, et al.
Published: (2025)
FLM-Audio: Natural Monologues Improves Native Full-Duplex Chatbots via Dual Training
by: Yao, Yiqun, et al.
Published: (2025)
Musical ethnocentrism in Large Language Models
by: Kruspe, Anna
Published: (2025)
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation
by: Zhang, Chong, et al.
Published: (2025)
Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning
by: Manco, Ilaria, et al.
Published: (2024)
Synthetic Audio Helps for Cognitive State Tasks
by: Soubki, Adil, et al.
Published: (2025)
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
by: Papi, Sara, et al.
Published: (2024)
Audio Contrastive-based Fine-tuning: Decoupling Representation Learning and Classification
by: Wang, Yang, et al.
Published: (2023)
Sing it, Narrate it: Quality Musical Lyrics Translation
by: Ye, Zhuorui, et al.
Published: (2024)
AI-Generated Song Detection via Lyrics Transcripts
by: Frohmann, Markus, et al.
Published: (2025)
Efficient Streaming LLM for Speech Recognition
by: Jia, Junteng, et al.
Published: (2024)
Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
by: Sharma, Rishikesh Kumar, et al.
Published: (2026)
Do Music Generation Models Encode Music Theory?
by: Wei, Megan, et al.
Published: (2024)
WavSLM: Single-Stream Speech Language Modeling via WavLM Distillation
by: Della Libera, Luca, et al.
Published: (2026)
Bob's Confetti: Phonetic Memorization Attacks in Music and Video Generation
by: Roh, Jaechul, et al.
Published: (2025)
Text2midi: Generating Symbolic Music from Captions
by: Bhandari, Keshav, et al.
Published: (2024)
Bona fide Cross Testing Reveals Weak Spot in Audio Deepfake Detection Systems
by: Kwok, Chin Yuen, et al.
Published: (2025)
Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR
by: Kulkarni, Ajinkya, et al.
Published: (2026)
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
by: Deng, Zihao, et al.
Published: (2023)
DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning
by: Mao, Zhuoyuan, et al.
Published: (2025)
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
by: Rao, Rajath, et al.
Published: (2025)
Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
by: Khaertdinov, Bulat, et al.
Published: (2024)
Controlling Surprisal in Music Generation via Information Content Curve Matching
by: Bjare, Mathias Rose, et al.
Published: (2024)
Aligning Language Models for Lyric-to-Melody Generation with Rule-Based Musical Constraints
by: Meng, Hao, et al.
Published: (2026)
PARCO: Phoneme-Augmented Robust Contextual ASR via Contrastive Entity Disambiguation
by: He, Jiajun, et al.
Published: (2025)