Bibliographic Details
Main Authors: Nixon, Michelle Pistner; McGovern, Kyle C.; Letourneau, Jeffrey; David, Lawrence A.; Lazar, Nicole A.; Mukherjee, Sayan; Silverman, Justin D.
Format: Preprint
Published: 2022
Subjects: Methodology
Online Access: https://arxiv.org/abs/2201.03616
_version_ 1866929575257178112
author Nixon, Michelle Pistner
McGovern, Kyle C.
Letourneau, Jeffrey
David, Lawrence A.
Lazar, Nicole A.
Mukherjee, Sayan
Silverman, Justin D.
contents Many scientific fields, including human gut microbiome science, collect multivariate count data where the sum of the counts is unrelated to the scale of the underlying system being measured (e.g., total microbial load in a subject's colon). This disconnect complicates downstream analyses such as differential analysis in case-control studies. This article is motivated by a novel study of in vitro human gut microbiome models. Popular tools for analyzing these data led to dramatically elevated rates of both false positives and false negatives. To understand those failures, we provide a formal problem statement that frames these challenges of scale in terms of the classical theory of identifiability. We call this the problem of Scale Reliant Inference (SRI). We use this formulation to prove fundamental limits on SRI in terms of criteria such as consistency and type-I error control. We show that the failures of existing methods stem from a fundamental failure to properly quantify uncertainty in the system scale. We demonstrate that a particular class of Bayesian models, called Bayesian Partially Identified Models (PIMs), can correctly quantify uncertainty in SRI. We introduce Scale Simulation Random Variables (SSRVs) as a flexible and efficient approach to specifying and inferring Bayesian PIMs. In the context of both real and simulated data, we find that SSRVs drastically decrease type-I and type-II error rates.
format Preprint
id arxiv_https___arxiv_org_abs_2201_03616
institution arXiv
publishDate 2022
record_format arxiv
title Scale Reliant Inference
topic Methodology
url https://arxiv.org/abs/2201.03616