Bibliographic Details
Main Authors: Feldman, Shai, Bates, Stephen, Romano, Yaniv
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2505.04733
_version_ 1866918357720104960
author Feldman, Shai
Bates, Stephen
Romano, Yaniv
author_facet Feldman, Shai
Bates, Stephen
Romano, Yaniv
contents We introduce a framework for robust uncertainty quantification when labeled training data are corrupted by noisy or missing labels. We build on conformal prediction, a statistical tool for generating prediction sets that cover the test label with a pre-specified probability. The validity of conformal prediction, however, relies on the i.i.d. assumption, which is violated in our setting because of the corruptions in the data. To account for this distribution shift, the privileged conformal prediction (PCP) method leverages privileged information (PI), that is, additional features available only during training, to re-weight the data distribution, yielding valid prediction sets under the assumption that the weights are accurate. In this work, we analyze the robustness of PCP to inaccuracies in the weights. Our analysis indicates that PCP can still yield valid uncertainty estimates even when the weights are poorly estimated. Furthermore, we introduce uncertain imputation (UI), a new conformal method that does not rely on weight estimation. Instead, we impute corrupted labels in a way that preserves their uncertainty. Our approach is supported by theoretical guarantees and validated empirically on both synthetic and real benchmarks. Finally, we show that these techniques can be integrated into a triply robust framework that ensures statistically valid predictions as long as at least one of the underlying methods is valid.
format Preprint
id arxiv_https___arxiv_org_abs_2505_04733
institution arXiv
publishDate 2025
record_format arxiv
title Conformal Prediction with Corrupted Labels: Uncertain Imputation and Robust Re-weighting
topic Machine Learning
url https://arxiv.org/abs/2505.04733