Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Zhong, Qihuang, Li, Haiyun, Zhuang, Luyao, Liu, Juhua, Du, Bo
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2407.00341
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866929521517658112
author	Zhong, Qihuang Li, Haiyun Zhuang, Luyao Liu, Juhua Du, Bo
author_facet	Zhong, Qihuang Li, Haiyun Zhuang, Luyao Liu, Juhua Du, Bo
contents	Aspect-based Sentiment Analysis (ABSA) is an important sentiment analysis task, which aims to determine the sentiment polarity towards an aspect in a sentence. Due to the expensive and limited labeled data, data generation (DG) has become the standard for improving the performance of ABSA. However, current DG methods usually have some shortcomings: 1) poor fluency and coherence, 2) lack of diversity of generated data, and 3) reliance on some existing labeled data, hindering its applications in real-world scenarios. With the advancement of large language models (LLMs), LLM-based DG has the potential to solve the above issues. Unfortunately, directly prompting LLMs struggles to generate the desired pseudo-label ABSA data, as LLMs are prone to hallucinations, leading to undesired data generation. To this end, we propose a systematic Iterative Data Generation framework, namely IDG, to boost the performance of ABSA. The core of IDG is to make full use of the powerful abilities (i.e., instruction-following, in-context learning and self-reflection) of LLMs to iteratively generate more fluent and diverse pseudo-label data, starting from an unsupervised sentence corpus. Specifically, IDG designs a novel iterative data generation mechanism and a self-reflection data filtering module to tackle the challenges of unexpected data generation caused by hallucinations. Extensive experiments on four widely-used ABSA benchmarks show that IDG brings consistent and significant performance gains among five baseline ABSA models. More encouragingly, the synthetic data generated by IDG can achieve comparable or even better performance against the manually annotated data.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_00341
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Iterative Data Generation with Large Language Models for Aspect-based Sentiment Analysis Zhong, Qihuang Li, Haiyun Zhuang, Luyao Liu, Juhua Du, Bo Computation and Language Aspect-based Sentiment Analysis (ABSA) is an important sentiment analysis task, which aims to determine the sentiment polarity towards an aspect in a sentence. Due to the expensive and limited labeled data, data generation (DG) has become the standard for improving the performance of ABSA. However, current DG methods usually have some shortcomings: 1) poor fluency and coherence, 2) lack of diversity of generated data, and 3) reliance on some existing labeled data, hindering its applications in real-world scenarios. With the advancement of large language models (LLMs), LLM-based DG has the potential to solve the above issues. Unfortunately, directly prompting LLMs struggles to generate the desired pseudo-label ABSA data, as LLMs are prone to hallucinations, leading to undesired data generation. To this end, we propose a systematic Iterative Data Generation framework, namely IDG, to boost the performance of ABSA. The core of IDG is to make full use of the powerful abilities (i.e., instruction-following, in-context learning and self-reflection) of LLMs to iteratively generate more fluent and diverse pseudo-label data, starting from an unsupervised sentence corpus. Specifically, IDG designs a novel iterative data generation mechanism and a self-reflection data filtering module to tackle the challenges of unexpected data generation caused by hallucinations. Extensive experiments on four widely-used ABSA benchmarks show that IDG brings consistent and significant performance gains among five baseline ABSA models. More encouragingly, the synthetic data generated by IDG can achieve comparable or even better performance against the manually annotated data.
title	Iterative Data Generation with Large Language Models for Aspect-based Sentiment Analysis
topic	Computation and Language
url	https://arxiv.org/abs/2407.00341

Similar Items