Bibliographic Details
Main Authors: Flyckt, Ricco Noel Hansen, Sjodsholm, Louise, Henriksen, Margrethe Høstgaard Bang, Brasen, Claus Lohman, Ebrahimi, Ali, Hilberg, Ole, Hansen, Torben Frøstrup, Wiil, Uffe Kock, Jensen, Lars Henrik, Peimankar, Abdolrahman
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2402.09596
author Flyckt, Ricco Noel Hansen
Sjodsholm, Louise
Henriksen, Margrethe Høstgaard Bang
Brasen, Claus Lohman
Ebrahimi, Ali
Hilberg, Ole
Hansen, Torben Frøstrup
Wiil, Uffe Kock
Jensen, Lars Henrik
Peimankar, Abdolrahman
contents Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses. Effective strategies for early detection are therefore of paramount importance. In recent years, machine learning (ML) has demonstrated considerable potential in healthcare by facilitating the detection of various diseases. In this retrospective development and validation study, we developed an ML model based on dynamic ensemble selection (DES) for LC detection. The model leverages standard blood sample analysis and smoking history data from a large population at risk in Denmark. The study includes all patients examined on suspicion of LC in the Region of Southern Denmark from 2009 to 2018. We validated and compared the predictions by the DES model with diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had complete data, of whom 2,505 (25%) had LC. The DES model achieved an area under the ROC curve of 0.77 ± 0.01, sensitivity of 76.2% ± 2.4%, specificity of 63.8% ± 2.3%, positive predictive value of 41.6% ± 1.2%, and F1-score of 53.8% ± 1.1%. The DES model outperformed all five pulmonologists, achieving a sensitivity 9% higher than their average. The model identified smoking status, age, total calcium levels, neutrophil count, and lactate dehydrogenase as the most important factors for the detection of LC. The results highlight the successful application of the ML approach in detecting LC, surpassing pulmonologists' performance. Incorporating clinical and laboratory data in future risk assessment models can improve decision-making and facilitate timely referrals.
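The metrics reported in the abstract are tied together by standard identities, so they can be checked against one another. The Python sketch below is an illustration only and is not taken from the paper. It derives the positive predictive value implied by the reported sensitivity, specificity, and LC prevalence (2,505 of 9,940), and the F1-score implied by the reported positive predictive value and sensitivity. The function names are hypothetical. The sketch assumes the reported figures are averages over repeated evaluations, so the identities should hold only approximately.

```python
def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence                   # fraction of all patients: LC and flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # fraction of all patients: no LC but flagged
    return true_pos / (true_pos + false_pos)


def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1-score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates quoted in the abstract (the ± spreads are ignored here).
prevalence = 2505 / 9940
sensitivity = 0.762
specificity = 0.638
reported_ppv = 0.416

implied_ppv = ppv_from_rates(sensitivity, specificity, prevalence)
implied_f1 = f1_from_ppv_and_sensitivity(reported_ppv, sensitivity)

print(f"prevalence  = {prevalence:.3f}")
print(f"implied PPV = {implied_ppv:.3f}")
print(f"implied F1  = {implied_f1:.3f}")
```

Under these assumptions the implied positive predictive value is about 41.5% and the implied F1-score is about 53.8%, both in line with the reported 41.6% and 53.8%.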
format Preprint
id arxiv_https___arxiv_org_abs_2402_09596
institution arXiv
publishDate 2024
record_format arxiv
title Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach
topic Machine Learning
url https://arxiv.org/abs/2402.09596