Guardado en:
Detalles Bibliográficos
Autores principales: Prijatelj, Derek S., Ireland, Timothy J., Scheirer, Walter J.
Formato: Preprint
Publicado: 2025
Materias:
Acceso en línea:https://arxiv.org/abs/2501.09331
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866917411001729024
author Prijatelj, Derek S.
Ireland, Timothy J.
Scheirer, Walter J.
author_facet Prijatelj, Derek S.
Ireland, Timothy J.
Scheirer, Walter J.
contents A machine that learns a task from observations must encounter and process uncertainty and novelty, especially when it is to maintain performance when observing new information and to select the hypothesis that best fits the current observations. In this context, some key questions arise: what and how much information did the observations provide, how much information is required to identify the data-generating process, how many observations remain to get that information, and how does a predictor determine that it has observed novel information? We formalize identifying information to answer these questions and synthesize prior works. Identifying information are bits that verify or falsify a hypothesis as the data-generating process. In this formalization, we prove the information theoretic characteristics of the computation of hypothesis identification and the resulting sample complexity. We define hypothesis identification and sample complexity via the computation of an indicator function over a set of hypotheses, bridging algorithmic and probabilistic information. We detail the sample complexity and its properties for data-generating processes ranging from deterministic processes to ergodic stationary stochastic processes, which connect the notion of identifying information in finite steps with asymptotic statistics and PAC-learning. The indicator function's computation naturally formalizes novel information and its identification from observations with respect to a hypothesis set, which detects a misspecified hypothesis set. We also proved that a computable PAC-Bayes learners' sample complexity distribution is determined by its moments in terms of the prior probability distribution over a fixed finite hypothesis set, and thus an approximation of the sample complexity distribution is always computable within the desired precision that resources allow.
format Preprint
id arxiv_https___arxiv_org_abs_2501_09331
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Identifying Information from Observations with Uncertainty and Novelty
Prijatelj, Derek S.
Ireland, Timothy J.
Scheirer, Walter J.
Machine Learning
G.3
A machine that learns a task from observations must encounter and process uncertainty and novelty, especially when it is to maintain performance when observing new information and to select the hypothesis that best fits the current observations. In this context, some key questions arise: what and how much information did the observations provide, how much information is required to identify the data-generating process, how many observations remain to get that information, and how does a predictor determine that it has observed novel information? We formalize identifying information to answer these questions and synthesize prior works. Identifying information are bits that verify or falsify a hypothesis as the data-generating process. In this formalization, we prove the information theoretic characteristics of the computation of hypothesis identification and the resulting sample complexity. We define hypothesis identification and sample complexity via the computation of an indicator function over a set of hypotheses, bridging algorithmic and probabilistic information. We detail the sample complexity and its properties for data-generating processes ranging from deterministic processes to ergodic stationary stochastic processes, which connect the notion of identifying information in finite steps with asymptotic statistics and PAC-learning. The indicator function's computation naturally formalizes novel information and its identification from observations with respect to a hypothesis set, which detects a misspecified hypothesis set. We also proved that a computable PAC-Bayes learners' sample complexity distribution is determined by its moments in terms of the prior probability distribution over a fixed finite hypothesis set, and thus an approximation of the sample complexity distribution is always computable within the desired precision that resources allow.
title Identifying Information from Observations with Uncertainty and Novelty
topic Machine Learning
G.3
url https://arxiv.org/abs/2501.09331