Bibliographic Details
Main authors: Mahmoud, Tarek; Xie, Zhuohan; Dimitrov, Dimitar; Nikolaidis, Nikolaos; Silvano, Purificação; Yangarber, Roman; Sharma, Shivam; Sartori, Elisa; Stefanovitch, Nicolas; Martino, Giovanni Da San; Piskorski, Jakub; Nakov, Preslav
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online access: https://arxiv.org/abs/2502.14718
_version_ 1866915345035427840
author Mahmoud, Tarek
Xie, Zhuohan
Dimitrov, Dimitar
Nikolaidis, Nikolaos
Silvano, Purificação
Yangarber, Roman
Sharma, Shivam
Sartori, Elisa
Stefanovitch, Nicolas
Martino, Giovanni Da San
Piskorski, Jakub
Nakov, Preslav
contents We introduce a novel multilingual hierarchical corpus annotated for entity framing and role portrayal in news articles. The dataset uses a unique taxonomy inspired by storytelling elements, comprising 22 fine-grained roles, or archetypes, nested within three main categories: protagonist, antagonist, and innocent. Each archetype is carefully defined, capturing nuanced portrayals of entities such as guardian, martyr, and underdog for protagonists; tyrant, deceiver, and bigot for antagonists; and victim, scapegoat, and exploited for innocents. The dataset includes 1,378 recent news articles in five languages (Bulgarian, English, Hindi, European Portuguese, and Russian) focusing on two critical domains of global significance: the Ukraine-Russia War and Climate Change. Over 5,800 entity mentions have been annotated with role labels. This dataset serves as a valuable resource for research into role portrayal and has broader implications for news analysis. We describe the characteristics of the dataset and the annotation process, and we report evaluation results on fine-tuned state-of-the-art multilingual transformers and hierarchical zero-shot learning using LLMs at the level of a document, a paragraph, and a sentence.
format Preprint
id arxiv_https___arxiv_org_abs_2502_14718
institution arXiv
publishDate 2025
record_format arxiv
title Entity Framing and Role Portrayal in the News
topic Computation and Language
url https://arxiv.org/abs/2502.14718