Bibliographic Details
Main Authors: Fröhlich, Christian, Williamson, Robert C.
Format: Preprint
Published: 2024
Subjects: Statistics Theory
Online Access: https://arxiv.org/abs/2404.09741
author Fröhlich, Christian
Williamson, Robert C.
author_facet Fröhlich, Christian
Williamson, Robert C.
contents Motivated by recently emerging problems in machine learning and statistics, we propose data models which relax the familiar i.i.d. assumption. In essence, we seek to understand what it means for data to come from a set of probability measures. We show that our frequentist data models, parameterized by such sets, manifest two aspects of imprecision. We characterize the intricate interplay of these manifestations, aggregate (ir)regularity and local (ir)regularity, where a much richer set of behaviours compared to an i.i.d. model is possible. In doing so we shed new light on the relationship between non-stationary, locally precise and stationary, locally imprecise data models. We discuss possible applications of these data models in machine learning and how the set of probabilities can be estimated. For the estimation of aggregate irregularity, we provide a negative result but argue that it does not warrant pessimism. Understanding these frequentist aspects of imprecise probabilities paves the way for deriving generalizations of proper scoring rules and calibration to the imprecise case, which can then contribute to tackling practical problems.
format Preprint
id arxiv_https___arxiv_org_abs_2404_09741
institution arXiv
publishDate 2024
record_format arxiv
title Data Models With Two Manifestations of Imprecision
topic Statistics Theory
url https://arxiv.org/abs/2404.09741