MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Marek, Martin; Paige, Brooks; Izmailov, Pavel
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2403.01272

_version_	1866914699922112512
author	Marek, Martin; Paige, Brooks; Izmailov, Pavel
author_facet	Marek, Martin; Paige, Brooks; Izmailov, Pavel
contents	Benchmark datasets used for image classification tend to have very low levels of label noise. When Bayesian neural networks are trained on these datasets, they often underfit, misrepresenting the aleatoric uncertainty of the data. A common solution is to cool the posterior, which improves fit to the training data but is challenging to interpret from a Bayesian perspective. We explore whether posterior tempering can be replaced by a confidence-inducing prior distribution. First, we introduce a "DirClip" prior that is practical to sample and nearly matches the performance of a cold posterior. Second, we introduce a "confidence prior" that directly approximates a cold likelihood in the limit of decreasing temperature but cannot be easily sampled. Lastly, we provide several general insights into confidence-inducing priors, such as when they might diverge and how fine-tuning can mitigate numerical instability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_01272
institution	arXiv
publishDate	2024
record_format	arxiv
title	Can a Confident Prior Replace a Cold Posterior?
topic	Machine Learning
url	https://arxiv.org/abs/2403.01272