Bibliographic Details
Main Authors: Zhang, Anyu; Hu, Jingzhen; Liang, Qingzhong; Dimitrova, Elena S.; Stigler, Brandilyn
Format: Preprint
Published: 2021
Online Access: https://arxiv.org/abs/2101.09384
author Zhang, Anyu
Hu, Jingzhen
Liang, Qingzhong
Dimitrova, Elena S.
Stigler, Brandilyn
contents Design of experiments and model selection, though essential steps in data science, are usually viewed as unrelated processes in the study and analysis of biological networks. Not accounting for their inter-relatedness has the potential to introduce bias and increase the risk of missing salient features in the modeling process. We propose a data-driven computational framework to unify experimental design and model selection for discrete data sets and minimal polynomial models. We use a special affine transformation, called a linear shift, to provide both the data sets and the polynomial terms that form a basis for a model. This framework enables us to address two important questions that arise in biological data science research: finding the data which identify a set of known interactions and finding identifiable interactions given a set of data. We present the theoretical foundation for a web-accessible database. As an example, we apply this methodology to a previously constructed pharmacodynamic model of epidermal derived growth factor receptor (EGFR) signaling.
format Preprint
id arxiv_https___arxiv_org_abs_2101_09384
institution arXiv
publishDate 2021
record_format arxiv
title Algebraic Model Selection and Experimental Design in Biological Data Science
topic Algebraic Geometry
Quantitative Methods
13P10, 14G15, 93B15, 93B20
url https://arxiv.org/abs/2101.09384