Bibliographic Details
Main Authors: Rao, Shishir, Mamouei, Mohammad, Salimi-Khorshidi, Gholamreza, Li, Yikuan, Ramakrishnan, Rema, Hassaine, Abdelaali, Canoy, Dexter, Rahimi, Kazem
Format: Preprint
Published: 2022
Online Access: https://arxiv.org/abs/2202.03487
_version_ 1866912086309732352
author Rao, Shishir
Mamouei, Mohammad
Salimi-Khorshidi, Gholamreza
Li, Yikuan
Ramakrishnan, Rema
Hassaine, Abdelaali
Canoy, Dexter
Rahimi, Kazem
contents Observational causal inference is useful for decision making in medicine when randomized clinical trials (RCTs) are infeasible or non-generalizable. However, traditional approaches fail to deliver unconfounded causal conclusions in practice. The rise of "doubly robust" non-parametric tools, coupled with the growth of deep learning for capturing rich representations of multimodal data, offers a unique opportunity to develop and test such models for causal inference on comprehensive electronic health records (EHR). In this paper, we investigate causal modelling of an RCT-established null causal association: the effect of antihypertensive use on incident cancer risk. We develop a dataset for our observational study and a Transformer-based model, Targeted BEHRT; coupling this model with doubly robust estimation, we estimate the average risk ratio (RR). We compare our model to benchmark statistical and deep learning models for causal inference in multiple experiments on semi-synthetic derivations of our dataset with various types and intensities of confounding. To further test the reliability of our approach, we evaluate our model in situations of limited data. We find that, across experiments, our model provides more accurate estimates of RR on high-dimensional EHR than the benchmarks, with the lowest sum of absolute errors from ground truth. Finally, we apply our model to the original case study, the effect of antihypertensives on cancer, and demonstrate that our model generally captures the validated null association.
format Preprint
id arxiv_https___arxiv_org_abs_2202_03487
institution arXiv
publishDate 2022
record_format arxiv
title Targeted-BEHRT: Deep learning for observational causal inference on longitudinal electronic health records
topic Machine Learning
url https://arxiv.org/abs/2202.03487