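Staff View: Selecting fitted models under epistemic uncertainty using a stochastic process on quantile functions :: Library Catalog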

Bibliographic Details
Main Authors:	René, Alexandre; Longtin, André
Format:	Preprint
Published:	2024
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2408.13414

_version_	1866912669010755584
author	René, Alexandre; Longtin, André
author_facet	René, Alexandre; Longtin, André
contents	Fitting models to data is an important part of the practice of science. Advances in machine learning have made it possible to fit more -- and more complex -- models, but have also exacerbated a problem: when multiple models fit the data equally well, which one(s) should we pick? The answer depends entirely on the modelling goal. In the scientific context, the essential goal is _replicability_: if a model works well to describe one experiment, it should continue to do so when that experiment is replicated tomorrow, or in another laboratory. The selection criterion must therefore be robust to the variations inherent to the replication process. In this work we develop a nonparametric method for estimating uncertainty on a model's empirical risk when replications are non-stationary, thus ensuring that a model is only rejected when another is _reproducibly_ better. We illustrate the method with two examples: one a more classical setting, where the models are structurally distinct, and a machine learning-inspired setting, where they differ only in the value of their parameters. We show how, in this context of replicability or "epistemic uncertainty", it compares favourably to existing model selection criteria, and has more satisfactory behaviour with large experimental datasets.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_13414
institution	arXiv
publishDate	2024
record_format	arxiv
title	Selecting fitted models under epistemic uncertainty using a stochastic process on quantile functions
topic	Methodology
url	https://arxiv.org/abs/2408.13414
