| Main Authors: | Weck, Benno, Puentes, Pablo, Poltronieri, Andrea, Prabhu, Satyajeet, Bogdanov, Dmitry |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.27877 |
Similar Items
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
by: Weck, Benno, et al.
Published: (2024)
The language of sound search: Examining User Queries in Audio Search Engines
by: Weck, Benno, et al.
Published: (2024)
Revisiting Meter Tracking in Carnatic Music using Deep Learning Approaches
by: Prabhu, Satyajeet
Published: (2025)
WikiMuTe: A web-sourced dataset of semantic descriptions for music audio
by: Weck, Benno, et al.
Published: (2023)
OMAR-RQ: Open Music Audio Representation Model Trained with Multi-Feature Masked Token Prediction
by: Alonso-Jiménez, Pablo, et al.
Published: (2025)
The Role of Large Language Models in Musicology: Are We Ready to Trust the Machines?
by: Ramoneda, Pedro, et al.
Published: (2024)
The ICASSP 2026 HumDial Challenge: Benchmarking Human-like Spoken Dialogue Systems in the LLM Era
by: Zhao, Zhixian, et al.
Published: (2026)
Voices of Civilizations: A Multilingual QA Benchmark for Global Music Understanding
by: Wu, Shangda, et al.
Published: (2026)
Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades
by: Jung, Donghyuk, et al.
Published: (2026)
Benchmarking Music Autotagging with MGPHot Expert Annotations vs. Generic Tag Datasets
by: Ramoneda, Pedro, et al.
Published: (2025)
Are you really listening? Boosting Perceptual Awareness in Music-QA Benchmarks
by: Zang, Yongyi, et al.
Published: (2025)
Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music
by: Sameti, Mohammad Hossein, et al.
Published: (2026)
BASS: Benchmarking Audio LMs for Musical Structure and Semantic Reasoning
by: Jang, Min, et al.
Published: (2026)
ASR Benchmarking: Need for a More Representative Conversational Dataset
by: Maheshwari, Gaurav, et al.
Published: (2024)
Jamendo-QA: A Large-Scale Music Question Answering Dataset
by: Koh, Junyoung, et al.
Published: (2025)
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
by: Ao, Junyi, et al.
Published: (2024)
Discogs-VI: A Musical Version Identification Dataset Based on Public Editorial Metadata
by: Araz, R. Oguz, et al.
Published: (2024)
Jamendo-MT-QA: A Benchmark for Multi-Track Comparative Music Question Answering
by: Koh, Junyoung, et al.
Published: (2026)
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation
by: Zhou, Ziya, et al.
Published: (2024)
CrossMuSim: A Cross-Modal Framework for Music Similarity Retrieval with LLM-Powered Text Description Sourcing and Mining
by: Tsoi, Tristan, et al.
Published: (2025)
Words at Play: Benchmarking Audio Pun Understanding in Large Audio-Language Models
by: Su, Yuchen, et al.
Published: (2026)
Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey
by: Kim, Yongjae, et al.
Published: (2025)
WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
by: Mundada, Gagan, et al.
Published: (2025)
RJUA-QA: A Comprehensive QA Dataset for Urology
by: Lyu, Shiwei, et al.
Published: (2023)
MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data through Question Answering
by: Asano, Hikaru, et al.
Published: (2025)
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
by: Ma, Ziyang, et al.
Published: (2025)
Persian Musical Instruments Classification Using Polyphonic Data Augmentation
by: Esfangereh, Diba Hadi, et al.
Published: (2025)
Linear Complexity Self-Supervised Learning for Music Understanding with Random Quantizer
by: Vavaroutsos, Petros, et al.
Published: (2026)
Rubato: Transcribing Piano Music with Timestamps
by: Tamer, Nazif Can, et al.
Published: (2026)
SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away
by: Li, Jiajia, et al.
Published: (2026)
CSyMR: Benchmarking Compositional Music Information Retrieval in Symbolic Music Reasoning
by: Wang, Boyang, et al.
Published: (2025)
Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models
by: Wang, Bin, et al.
Published: (2025)
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
by: Liu, Zhenyu, et al.
Published: (2025)
DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning
by: Mao, Zhuoyuan, et al.
Published: (2025)
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark
by: Wang, Dingdong, et al.
Published: (2025)
MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
by: Liao, Callie C., et al.
Published: (2025)
OpenMU: Your Swiss Army Knife for Music Understanding
by: Zhao, Mengjie, et al.
Published: (2024)
SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection
by: Yi, Jiangyan, et al.
Published: (2022)
BRACE: A Benchmark for Robust Audio Caption Quality Evaluation
by: Guo, Tianyu, et al.
Published: (2025)
MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation
by: Wu, Shih-Lun, et al.
Published: (2025)