Bibliographic Details
Main Authors: Vuruma, Sai Krishna Revanth, Wu, Dezhi, Gupta, Saborny Sen, Aust, Lucas, Lookingbill, Valerie, Henry, Caleb, Ren, Yang, Kasson, Erin, Chen, Li-Shiun, Cavazos-Rehg, Patricia, Hu, Dian, Huang, Ming
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2404.17607
author Vuruma, Sai Krishna Revanth
Wu, Dezhi
Gupta, Saborny Sen
Aust, Lucas
Lookingbill, Valerie
Henry, Caleb
Ren, Yang
Kasson, Erin
Chen, Li-Shiun
Cavazos-Rehg, Patricia
Hu, Dian
Huang, Ming
contents The widespread global adoption of social media platforms not only enhances users' connectivity and communication but also makes these platforms a vital channel for disseminating health-related information, establishing social media data as an invaluable organic data resource for public health research. The surge in popularity of vaping or e-cigarette use in the United States and other countries has caused an outbreak of e-cigarette and vaping use-associated lung injury (EVALI) that led to hospitalizations and fatalities in 2019, highlighting the urgency of understanding vaping behaviors and developing effective cessation strategies. In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users' quit-vaping intentions. Leveraging large language models, including both the latest GPT-4 and traditional BERT-based language models, for sentence-level quit-vaping intention prediction, this study compares the outputs of these models against human annotations. Notably, when compared to human evaluators, the GPT-4 model demonstrates superior consistency in adhering to annotation guidelines and processes, showcasing advanced capabilities to detect nuanced quit-vaping intentions that human evaluators might overlook. These preliminary findings emphasize the potential of GPT-4 to enhance the accuracy and reliability of social media data analysis, especially in identifying subtle user intentions that may elude human detection.
format Preprint
id arxiv_https___arxiv_org_abs_2404_17607
institution arXiv
publishDate 2024
record_format arxiv
title Utilizing Large Language Models to Identify Reddit Users Considering Vaping Cessation for Digital Interventions
topic Information Retrieval
Artificial Intelligence
Computation and Language
Machine Learning
Social and Information Networks
url https://arxiv.org/abs/2404.17607