Staff View: Variational Bayes latent class approach for EHR-based phenotyping with large real-world data :: Library Catalog

Bibliographic Details
Main Authors:	Buckley, Brian; O'Hagan, Adrian; Galligan, Marie
Format:	Preprint
Published:	2023
Subjects:	Applications
Online Access:	https://arxiv.org/abs/2304.03733

_version_	1866912004008050688
author	Buckley, Brian; O'Hagan, Adrian; Galligan, Marie
author_facet	Buckley, Brian; O'Hagan, Adrian; Galligan, Marie
contents	We investigate the performance and characteristics of currently available variational Bayes (VB) and Markov chain Monte Carlo (MCMC) software to explore the practicability of available approaches and provide guidance for clinical practitioners. Two case studies, covering a variety of real-world data, are used to explore the methods. First, we use the publicly available Pima Indian diabetes data to comprehensively compare VB implementations of logistic regression. Second, a large real-world data set, Optum(TM) EHR with approximately one million diabetes patients, extends the analysis to large, highly unbalanced data containing discrete and continuous variables. A Bayesian patient phenotyping composite model incorporating latent class analysis (LCA) and regression was implemented for the second case study. We find that several data characteristics common in clinical data, such as sparsity, significantly affect the posterior accuracy of automatic VB methods compared with conditionally conjugate mean-field methods. For both models, automatic VB approaches require more effort and technical knowledge to set up for accurate posterior estimation, and are very sensitive to stopping time, compared with closed-form VB methods. Our results indicate that the patient phenotyping composite Bayes model is more easily usable for real-world studies if Monte Carlo is replaced with VB. It can potentially become a uniquely useful tool for decision support, especially for rare diseases where gold-standard biomarker data is sparse but prior knowledge can be used to assist model diagnosis and may suggest when biomarker tests are warranted.
format	Preprint
id	arxiv_https___arxiv_org_abs_2304_03733
institution	arXiv
publishDate	2023
record_format	arxiv
title	Variational Bayes latent class approach for EHR-based phenotyping with large real-world data
topic	Applications
url	https://arxiv.org/abs/2304.03733
