Bibliographic Details
Main Authors: Tang, Guojun; Black, Jason E.; Williamson, Tyler S.; Drew, Steve H.
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2408.12029
author Tang, Guojun
Black, Jason E.
Williamson, Tyler S.
Drew, Steve H.
contents Integrating Electronic Health Records (EHR) with machine learning presents opportunities to improve the accuracy and accessibility of data-driven diabetes prediction. In particular, data-driven machine learning models can identify patients at high risk of diabetes early, potentially leading to more effective therapeutic strategies and reduced healthcare costs. However, regulatory restrictions create barriers to developing centralized predictive models. This paper addresses these challenges by introducing a federated learning approach, which combines predictive models without centralized data storage and processing, thus avoiding privacy issues. This marks the first application of federated learning to predict diabetes using real clinical datasets in Canada, extracted from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), without cross-province patient data sharing. We address class imbalance through downsampling and compare federated learning performance against province-based and centralized models. Experimental results show that the federated MLP model achieves similar or higher performance than the model trained with the centralized approach. However, the federated logistic regression model performs worse than its centralized counterpart.
format Preprint
id arxiv_https___arxiv_org_abs_2408_12029
institution arXiv
publishDate 2024
record_format arxiv
title Federated Diabetes Prediction in Canadian Adults Using Real-world Cross-Province Primary Care Data
topic Computational Engineering, Finance, and Science
Artificial Intelligence
url https://arxiv.org/abs/2408.12029