Bibliographic Details
Main Author: Kirsch, Andreas
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2409.02628
author Kirsch, Andreas
author_facet Kirsch, Andreas
contents Epistemic uncertainty is crucial for safety-critical applications and data acquisition tasks. Yet, we find an important phenomenon in deep learning models: an epistemic uncertainty collapse as model complexity increases, challenging the assumption that larger models invariably offer better uncertainty quantification. We introduce implicit ensembling as a possible explanation for this phenomenon. To investigate this hypothesis, we provide theoretical analysis and experiments that demonstrate uncertainty collapse in explicit ensembles of ensembles and show experimental evidence of similar collapse in wider models across various architectures, from simple MLPs to state-of-the-art vision models including ResNets and Vision Transformers. We further develop implicit ensemble extraction techniques to decompose larger models into diverse sub-models, showing we can thus recover epistemic uncertainty. We explore the implications of these findings for uncertainty estimation.
format Preprint
id arxiv_https___arxiv_org_abs_2409_02628
institution arXiv
publishDate 2024
record_format arxiv
title (Implicit) Ensembles of Ensembles: Epistemic Uncertainty Collapse in Large Models
topic Machine Learning
url https://arxiv.org/abs/2409.02628