Bibliographic Details
Main Authors: Goncharok, Dvora, Shifman, Arbel, Apartsin, Alexander, Aperstein, Yehudit
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2509.11802
author Goncharok, Dvora
Shifman, Arbel
Apartsin, Alexander
Aperstein, Yehudit
author_facet Goncharok, Dvora
Shifman, Arbel
Apartsin, Alexander
Aperstein, Yehudit
contents Online medical forums are a rich and underutilized source of insight into patient concerns, especially regarding medication use. Some of the many questions users pose may signal confusion, misuse, or even the early warning signs of a developing health crisis. Detecting these critical questions, which may precede severe adverse events or life-threatening complications, is vital for timely intervention and improved patient safety. This study introduces a novel annotated dataset of medication-related questions extracted from online forums. Each entry is manually labelled for criticality based on clinical risk factors. We benchmark the performance of six traditional machine learning classifiers using TF-IDF textual representations, alongside three state-of-the-art large language model (LLM)-based classification approaches that leverage deep contextual understanding. Our results highlight the potential of both classical and modern methods to support real-time triage and alert systems in digital health spaces. The curated dataset is made publicly available to encourage further research at the intersection of patient-generated data, natural language processing, and early warning systems for critical health events. The dataset and benchmark are available at: https://github.com/Dvora-coder/LLM-Medication-QA-Risk-Classifier-MediGuard.
format Preprint
id arxiv_https___arxiv_org_abs_2509_11802
institution arXiv
publishDate 2025
record_format arxiv
title When Curiosity Signals Danger: Predicting Health Crises Through Online Medication Inquiries
topic Computation and Language
url https://arxiv.org/abs/2509.11802