Bibliographic Details
Main Authors: Amann, Nicolai; Leeb, Hannes; Steinberger, Lukas
Format: Preprint
Published: 2023
Online Access: https://arxiv.org/abs/2312.14596
Table of Contents:
  • Recently, there has been substantial interest in statistical guarantees for cross-validation (CV) methods of uncertainty quantification in statistical learning (cf. Barber et al. 2021a, Liang and Barber 2024, Steinberger and Leeb 2023). These guarantees should hold under minimal assumptions on the data generating process and conditional on the training data, because numerous predictions are usually computed from one and the same training sample. We push this objective to the limit: we prove asymptotic conditional conservativeness of CV under minimal assumptions, that is, the probability that the actual coverage probability, conditional on the training data, undershoots its nominal level vanishes asymptotically. In particular, we impose a stability condition and require that the prediction error is stochastically bounded, and we show that neither condition can be dropped in general. By way of an asymptotic equivalence result, we also show that the closely related CV+ method of Barber et al. (2021a) provides exactly the same conditional statistical guarantees as CV in large samples, thereby extending the range of applicability of CV+ to the high-dimensional regime. We conclude that, in view of its marginal coverage guarantee, CV+ does indeed improve over simple CV. For our proofs we introduce a new concept called the Lévy gauge, which may be of independent interest.
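
To make the two methods compared in the abstract concrete, the following is a minimal sketch of K-fold CV and CV+ prediction intervals. It is an illustration under stated assumptions, not the authors' construction: it uses the absolute-residual form of CV+ usually attributed to Barber et al. (2021a), an ordinary least-squares predictor as a stand-in for any learning algorithm, and hypothetical helper names (`fit_ols`, `predict_ols`, `cv_and_cv_plus`). The exact definition of the simple CV interval in the paper may differ (for example, by using signed residuals).

```python
import numpy as np

def fit_ols(X, y):
    # Hypothetical stand-in predictor: least squares with an intercept.
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_ols(beta, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def upper_quantile(v, alpha):
    # ceil((1 - alpha)(n + 1))-th smallest value, +inf if that exceeds n.
    n = len(v)
    k = int(np.ceil((1 - alpha) * (n + 1)))
    return np.inf if k > n else np.sort(v)[k - 1]

def lower_quantile(v, alpha):
    # floor(alpha (n + 1))-th smallest value, -inf if that is below 1.
    n = len(v)
    k = int(np.floor(alpha * (n + 1)))
    return -np.inf if k < 1 else np.sort(v)[k - 1]

def cv_and_cv_plus(X, y, x_new, alpha=0.1, n_folds=5, seed=0):
    """Return (CV interval, CV+ interval) for a single new point x_new."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    resid = np.empty(n)      # out-of-fold absolute residuals
    pred_new = np.empty(n)   # prediction at x_new from the model that excluded i
    for idx in folds:
        train = np.ones(n, dtype=bool)
        train[idx] = False
        beta_k = fit_ols(X[train], y[train])
        resid[idx] = np.abs(y[idx] - predict_ols(beta_k, X[idx]))
        pred_new[idx] = predict_ols(beta_k, x_new[None, :])[0]

    # Simple CV: centre at the full-sample prediction, width from CV residuals.
    center = predict_ols(fit_ols(X, y), x_new[None, :])[0]
    q = upper_quantile(resid, alpha)
    cv_interval = (center - q, center + q)

    # CV+: quantiles of the fold-wise predictions shifted by the residuals.
    cv_plus_interval = (lower_quantile(pred_new - resid, alpha),
                        upper_quantile(pred_new + resid, alpha))
    return cv_interval, cv_plus_interval
```

When the predictor is stable, the fold-wise predictions at `x_new` are close to the full-sample prediction, so the two intervals nearly coincide. This is the intuition behind the asymptotic equivalence mentioned in the abstract, though the sketch itself proves nothing.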