Bibliographic Details
Main Authors: Lee, Harin; Çelen, Elif; Harrison, Peter; Anglada-Tort, Manuel; van Rijn, Pol; Park, Minsu; Schönwiesner, Marc; Jacoby, Nori
Format: Preprint
Published: 2025
Subjects: Information Retrieval
Online Access: https://arxiv.org/abs/2505.09539
author Lee, Harin
Çelen, Elif
Harrison, Peter
Anglada-Tort, Manuel
van Rijn, Pol
Park, Minsu
Schönwiesner, Marc
Jacoby, Nori
contents Human annotations of mood in music are essential for music generation and recommender systems. However, existing datasets predominantly focus on Western songs with terms derived from English, which may limit generalizability across diverse linguistic and cultural backgrounds. We introduce 'GlobalMood', a novel cross-cultural benchmark dataset comprising 1,180 songs sampled from 59 countries, with large-scale annotations collected from 2,519 individuals across five culturally and linguistically distinct locations: U.S., France, Mexico, S. Korea, and Egypt. Rather than imposing predefined emotion and mood categories, we implement a bottom-up, participant-driven approach to organically elicit culturally specific music-related emotion terms. We then recruit another pool of human participants to collect 988,925 ratings for these culture-specific descriptors. Our analysis confirms the presence of a valence-arousal structure shared across cultures, yet also reveals significant divergences in how certain emotion terms (despite being dictionary equivalents) are perceived cross-culturally. State-of-the-art multimodal models benefit substantially from fine-tuning on our cross-culturally balanced dataset, particularly in non-English contexts. Broadly, our findings inform the ongoing debate on the universality versus cultural specificity of emotional descriptors, and our methodology can contribute to other multimodal and cross-lingual research.
format Preprint
id arxiv_https___arxiv_org_abs_2505_09539
institution arXiv
publishDate 2025
record_format arxiv
title GlobalMood: A cross-cultural benchmark for music emotion recognition
topic Information Retrieval
url https://arxiv.org/abs/2505.09539