
Bibliographic Details
Main Authors:	Jinensibieke, Dawulie; Maimaiti, Mieradilijiang; Xiao, Wentao; Zheng, Yuanhang; Wang, Xiaobo
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.11162

_version_	1866929400133451776
author	Jinensibieke, Dawulie; Maimaiti, Mieradilijiang; Xiao, Wentao; Zheng, Yuanhang; Wang, Xiaobo
author_facet	Jinensibieke, Dawulie; Maimaiti, Mieradilijiang; Xiao, Wentao; Zheng, Yuanhang; Wang, Xiaobo
contents	Relation Extraction (RE) is a key technology for transforming unstructured text into structured information, especially in Knowledge Graph development, and it plays an essential role in various downstream tasks. Besides conventional RE methods based on neural networks and pre-trained language models, large language models (LLMs) are also used in RE research. However, on low-resource languages (LRLs), both conventional and LLM-based methods perform poorly because of data scarcity. To address this, the paper constructs low-resource relation extraction datasets in 10 LRLs from three regions (Central Asia, Southeast Asia and the Middle East). The corpora are built by translating publicly available English RE datasets (NYT10, FewRel and CrossRE) with an effective multilingual machine translation system. Language perplexity (PPL) is then used to filter low-quality data out of the translated datasets. Finally, the authors conduct an empirical study that evaluates several open-source LLMs on the generated LRL RE datasets.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_11162
institution	arXiv
publishDate	2024
record_format	arxiv
title	How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation
topic	Computation and Language
url	https://arxiv.org/abs/2406.11162