Library Catalog

Bibliographic Details
Main Authors:	Ouyang, Zhihao; Wang, Ju-Chiang; Zhang, Daiyu; Chen, Bin; Li, Shangjie; Lin, Quan
Format:	Preprint
Published:	2025
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2508.19514

Similar Items

Jamendo-QA: A Large-Scale Music Question Answering Dataset
by: Koh, Junyoung, et al.
Published: (2025)

Learning Musical Representations for Music Performance Question Answering
by: Diao, Xingjian, et al.
Published: (2025)

Temporal Adaptation of Pre-trained Foundation Models for Music Structure Analysis
by: Zhang, Yixiao, et al.
Published: (2025)

Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music
by: Sameti, Mohammad Hossein, et al.
Published: (2026)

The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models
by: Li, Jiajia, et al.
Published: (2024)

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
by: Li, Yizhi, et al.
Published: (2023)

NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
by: Wang, Yashan, et al.
Published: (2025)

InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation
by: Zhang, Chong, et al.
Published: (2025)

Large-Scale Training Data Attribution for Music Generative Models via Unlearning
by: Choi, Woosung, et al.
Published: (2025)

Heterogeneity-Aware Dataset Scheduling for Efficient Audio Large Language Model Training
by: Wu, Yanru, et al.
Published: (2026)

Content-based Controls For Music Large Language Modeling
by: Lin, Liwei, et al.
Published: (2023)

Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models
by: Sridhar, Arvind Krishna, et al.
Published: (2024)

JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata
by: Roy, Abhinaba, et al.
Published: (2025)

Music Audio-Visual Question Answering Requires Specialized Multimodal Designs
by: You, Wenhao, et al.
Published: (2025)

Musical Score Understanding Benchmark: Evaluating Large Language Models' Comprehension of Complete Musical Scores
by: Dai, Congren, et al.
Published: (2025)

Assessing Factual Music Comprehension in Large Audio Language Models
by: Lin, Daniel Chenyu, et al.
Published: (2025)

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
by: Guo, Wenxiang, et al.
Published: (2025)

PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
by: Long, Phillip, et al.
Published: (2024)

Jamendo-MT-QA: A Benchmark for Multi-Track Comparative Music Question Answering
by: Koh, Junyoung, et al.
Published: (2026)

PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training
by: Liang, Xiao, et al.
Published: (2024)

When Noise Lowers The Loss: Rethinking Likelihood-Based Evaluation in Music Large Language Models
by: Li, Xiaosha, et al.
Published: (2026)

Large Language Models: From Notes to Musical Form
by: Atassi, Lilac
Published: (2024)

Learning Sparsity for Effective and Efficient Music Performance Question Answering
by: Diao, Xingjian, et al.
Published: (2025)

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering
by: Hu, Jiliang, et al.
Published: (2025)

ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following
by: Zhao, Jiahao, et al.
Published: (2025)

MusicScore: A Dataset for Music Score Modeling and Generation
by: Lin, Yuheng, et al.
Published: (2024)

Musical ethnocentrism in Large Language Models
by: Kruspe, Anna
Published: (2025)

Layer-wise Investigation of Large-Scale Self-Supervised Music Representation Models
by: Zhou, Yizhi, et al.
Published: (2025)

SymPAC: Scalable Symbolic Music Generation With Prompts And Constraints
by: Chen, Haonan, et al.
Published: (2024)

ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts
by: Garg, Ashi, et al.
Published: (2025)

Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models
by: Li, Jiajun, et al.
Published: (2024)

TALKPLAY: Multimodal Music Recommendation with Large Language Models
by: Doh, Seungheon, et al.
Published: (2025)

Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
by: He, Haorui, et al.
Published: (2025)

AQUALLM: Audio Question Answering Data Generation Using Large Language Models
by: Behera, Swarup Ranjan, et al.
Published: (2023)

Tadabur: A Large-Scale Quran Audio Dataset
by: Alherran, Faisal
Published: (2026)

CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)

Score-Agnostic Structure Analysis in Large-Scale Performance Datasets
by: Hu, Patricia, et al.
Published: (2026)

How Contrastive Decoding Enhances Large Audio Language Models?
by: Lin, Tzu-Quan, et al.
Published: (2026)

Scaling Audio-Text Retrieval with Multimodal Large Language Models
by: Xu, Jilan, et al.
Published: (2026)

ChronosAudio: A Comprehensive Long-Audio Benchmark for Evaluating Audio-Large Language Models
by: Luo, Kaiwen, et al.
Published: (2026)