Bibliographic Details
Main Authors: Pant, Devesh; Grandhe, Rishi Raj; Samaria, Vipin; Paul, Mukul; Kumar, Sudhir; Khanna, Saransh; Agrawal, Jatin; Kalra, Jushaan Singh; VSSG, Akhil; Khalikar, Satish V; Garg, Vipin; Chauhan, Himanshu; Verma, Pranay; Khandelwal, Neha; Dhavala, Soma S; Mathew, Minesh
Format: Preprint
Published: 2025
Subjects: Computation and Language; Information Retrieval
Online Access: https://arxiv.org/abs/2506.19548
author Pant, Devesh
Grandhe, Rishi Raj
Samaria, Vipin
Paul, Mukul
Kumar, Sudhir
Khanna, Saransh
Agrawal, Jatin
Kalra, Jushaan Singh
VSSG, Akhil
Khalikar, Satish V
Garg, Vipin
Chauhan, Himanshu
Verma, Pranay
Khandelwal, Neha
Dhavala, Soma S
Mathew, Minesh
contents Early detection of disease outbreaks is crucial to ensure timely intervention by the health authorities. Due to the challenges associated with traditional indicator-based surveillance, monitoring informal sources such as online media has become increasingly popular. However, owing to the number of online articles published every day, manual screening of the articles is impractical. To address this, we propose Health Sentinel, a multi-stage information extraction pipeline that uses a combination of ML and non-ML methods to extract events (structured information concerning disease outbreaks or other unusual health events) from online articles. The extracted events are made available to the Media Scanning and Verification Cell (MSVC) at the National Centre for Disease Control (NCDC), Delhi, for analysis, interpretation, and further dissemination to local agencies for timely intervention. From April 2022 to date, Health Sentinel has processed over 300 million news articles and identified over 95,000 unique health events across India, of which over 3,500 were shortlisted by the public health experts at NCDC as potential outbreaks.
format Preprint
id arxiv_https___arxiv_org_abs_2506_19548
institution arXiv
publishDate 2025
record_format arxiv
title Health Sentinel: An AI Pipeline For Real-time Disease Outbreak Detection
topic Computation and Language
Information Retrieval
url https://arxiv.org/abs/2506.19548