Staff View: Statistical quantification of confounding bias in predictive modelling :: Library Catalog

Bibliographic Details
Main Author:	Spisak, Tamas
Format:	Preprint
Published:	2021
Subjects:	Machine Learning; Quantitative Methods; G.3; I.2.1
Online Access:	https://arxiv.org/abs/2111.00814

_version_	1866910972922298368
author	Spisak, Tamas
author_facet	Spisak, Tamas
contents	The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively. The tests provide a strict control for Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions, often seen in machine learning. Applying the proposed tests on models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or found to be hard to correct for with state-of-the-art confound mitigation approaches. The tests, implemented in the package mlconfound (https://mlconfound.readthedocs.io), can aid the assessment and improvement of the generalizability and neurobiological validity of predictive models and, thereby, foster the development of clinically useful machine learning biomarkers.
format	Preprint
id	arxiv_https___arxiv_org_abs_2111_00814
institution	arXiv
publishDate	2021
record_format	arxiv
title	Statistical quantification of confounding bias in predictive modelling
topic	Machine Learning; Quantitative Methods; G.3; I.2.1
url	https://arxiv.org/abs/2111.00814

Similar Items