Bibliographic Details
Main Authors: Rockenschaub, Patrick; Xian, Zhicong; Zamanian, Alireza; Piperno, Marta; Ciora, Octavia-Andreea; Pachl, Elisabeth; Ahmidi, Narges
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2406.16484
author Rockenschaub, Patrick
Xian, Zhicong
Zamanian, Alireza
Piperno, Marta
Ciora, Octavia-Andreea
Pachl, Elisabeth
Ahmidi, Narges
contents Prediction becomes more challenging with missing covariates. The method chosen to handle missingness can greatly affect how models perform. In many real-world problems, the best prediction performance is achieved by models that can leverage the informative nature of a value being missing. Yet, the reasons why a covariate goes missing can change once a model is deployed in practice. If such a missingness shift occurs, the conditional probability of a value being missing differs in the target data. Prediction performance in the source data may no longer be a good selection criterion, and approaches that do not rely on informative missingness may be preferable. However, we show that the Bayes predictor remains unchanged by ignorable shifts for which the probability of missingness only depends on observed data. Any consistent estimator of the Bayes predictor may therefore result in robust prediction under those conditions, although we show empirically that different methods appear robust to different types of shifts. If the missingness shift is non-ignorable, the Bayes predictor may change due to the shift. While neither approach recovers the Bayes predictor in this case, we found empirically that disregarding missingness was most beneficial when it was highly informative.
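Illustrative note (not part of the catalogued record or the preprint): the abstract's claim about ignorable shifts can be sketched with a small simulation. Everything below is an assumption made for illustration, including the data-generating model, the logistic missingness mechanism, and the names `simulate` and `fit_when_missing`. Covariate `x1` is always observed and `x2` may be missing. For rows where `x2` is missing, the best predictor of `y` from `x1` is estimated under a source and a target missingness mechanism.

```python
import numpy as np

def simulate(rng, n, intercept, slope, depends_on):
    """Draw x1, y and a mask m (True where x2 is missing).

    Hypothetical toy model: x2 = 0.5*x1 + noise, y = x1 + x2 + noise,
    so E[y | x1] = 1.5*x1 in the full population.
    """
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    y = x1 + x2 + rng.normal(size=n)
    # Missingness is driven either by the observed x1 (ignorable here)
    # or by the value of x2 itself (non-ignorable).
    driver = x1 if depends_on == "x1" else x2
    p_missing = 1.0 / (1.0 + np.exp(-(intercept + slope * driver)))
    m = rng.random(n) < p_missing
    return x1, y, m

def fit_when_missing(x1, y, m):
    """Least-squares fit of y on x1 using only rows where x2 is missing."""
    design = np.column_stack([np.ones(int(m.sum())), x1[m]])
    coef, *_ = np.linalg.lstsq(design, y[m], rcond=None)
    return coef  # [intercept, slope]

rng = np.random.default_rng(0)
n = 200_000

# Ignorable shift: missingness depends on x1 only, mechanism changes.
ign_source = fit_when_missing(*simulate(rng, n, 0.0, 1.0, "x1"))
ign_target = fit_when_missing(*simulate(rng, n, 1.0, -2.0, "x1"))

# Non-ignorable shift: missingness depends on the unobserved x2.
non_source = fit_when_missing(*simulate(rng, n, 0.0, 1.0, "x2"))
non_target = fit_when_missing(*simulate(rng, n, 0.0, -2.0, "x2"))

print("ignorable     source/target:", ign_source, ign_target)
print("non-ignorable source/target:", non_source, non_target)
```

In this toy setting the two ignorable fits should agree up to sampling noise (both near intercept 0 and slope 1.5), while the non-ignorable fits differ because the missing rows carry information about `x2`. The sketch covers only the linear toy model above and does not reproduce the preprint's experiments.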
format Preprint
id arxiv_https___arxiv_org_abs_2406_16484
institution arXiv
publishDate 2024
record_format arxiv
title Robust prediction under missingness shifts
topic Machine Learning
url https://arxiv.org/abs/2406.16484