Staff View: Research on Predicting Public Opinion Event Heat Levels Based on Large Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Ren, Yi; Zhang, Tianyi; Li, Weibin; Zhou, DuoMu; Qin, Chenhao; Dong, FangCheng
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.18548

_version_	1866912048011542528
author	Ren, Yi; Zhang, Tianyi; Li, Weibin; Zhou, DuoMu; Qin, Chenhao; Dong, FangCheng
author_facet	Ren, Yi; Zhang, Tianyi; Li, Weibin; Zhou, DuoMu; Qin, Chenhao; Dong, FangCheng
contents	In recent years, with the rapid development of large language models, several models such as GPT-4o have demonstrated extraordinary capabilities, surpassing human performance in various language tasks. As a result, many researchers have begun exploring their potential applications in the field of public opinion analysis. This study proposes a novel method, based on large language models, for predicting the heat level of public opinion events. First, we preprocessed and classified 62,836 Chinese hot events collected between July 2022 and December 2023. Then, based on each event's online dissemination heat index, we used the MiniBatchKMeans algorithm to automatically cluster the events and categorize them into four heat levels (ranging from low heat to very high heat). Next, we randomly selected 250 events from each heat level, totalling 1,000 events, to build the evaluation dataset. During the evaluation process, we employed various large language models to assess their accuracy in predicting event heat levels in two scenarios: without reference cases and with similar case references. The results showed that GPT-4o and DeepseekV2 performed the best in the latter scenario, achieving prediction accuracies of 41.4% and 41.5%, respectively. Although the overall prediction accuracy remains relatively low, it is worth noting that for low-heat (Level 1) events, the prediction accuracies of these two models reached 73.6% and 70.4%, respectively. Additionally, the prediction accuracy showed a downward trend from Level 1 to Level 4, which correlates with the uneven distribution of data across the heat levels in the actual dataset. This suggests that, with a more robust dataset, public opinion event heat level prediction based on large language models will have significant research potential.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_18548
institution	arXiv
publishDate	2024
record_format	arxiv
title	Research on Predicting Public Opinion Event Heat Levels Based on Large Language Models
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2409.18548
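
Note: the abstract above describes drawing an equal number of events from each heat level and then reporting overall and per-level prediction accuracy. The following is a minimal sketch of that bookkeeping only, not the authors' code. The clustering step and the language model calls are omitted, and the function names, the "level" field, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_per_level(events, per_level, seed=0):
    """Draw the same number of events from every heat level (balanced sample)."""
    by_level = defaultdict(list)
    for event in events:
        by_level[event["level"]].append(event)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for level in sorted(by_level):
        sample.extend(rng.sample(by_level[level], per_level))
    return sample


def accuracy_by_level(events, predictions):
    """Return (overall accuracy, {level: accuracy}) for predicted heat levels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for event, predicted in zip(events, predictions):
        total[event["level"]] += 1
        correct[event["level"]] += int(predicted == event["level"])
    per_level = {level: correct[level] / total[level] for level in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_level


# Hypothetical toy data: 40 synthetic events, 10 in each of four levels.
toy_events = [{"id": i, "level": 1 + i % 4} for i in range(40)]
evaluation_set = sample_per_level(toy_events, per_level=5)

# Placeholder predictor that always answers Level 1 (stands in for a model).
predictions = [1 for _ in evaluation_set]
overall, per_level = accuracy_by_level(evaluation_set, predictions)
```

With a balanced sample, a predictor that always answers one level scores 100% on that level and 25% overall, which is a useful baseline when reading per-level figures such as those quoted in the abstract.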

Similar Items