Staff View: Confidence-based Estimators for Predictive Performance in Model Monitoring :: Library Catalog

Bibliographic Details
Main Authors:	Kivimäki, Juhani; Białek, Jakub; Nurminen, Jukka K.; Kuberski, Wojtek
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.08649

_version_	1866910822602637312
author	Kivimäki, Juhani; Białek, Jakub; Nurminen, Jukka K.; Kuberski, Wojtek
author_facet	Kivimäki, Juhani; Białek, Jakub; Nurminen, Jukka K.; Kuberski, Wojtek
contents	After a machine learning model has been deployed into production, its predictive performance needs to be monitored. Ideally, such monitoring can be carried out by comparing the model's predictions against ground truth labels. For this to be possible, the ground truth labels must be available relatively soon after inference. However, there are many use cases where ground truth labels are available only after a significant delay, or in the worst case, not at all. In such cases, directly monitoring the model's predictive performance is impossible. Recently, novel methods for estimating the predictive performance of a model when ground truth is unavailable have been developed. Many of these methods leverage model confidence or other uncertainty estimates and are experimentally compared against a naive baseline method, namely Average Confidence (AC), which estimates model accuracy as the average of confidence scores for a given set of predictions. However, until now the theoretical properties of the AC method have not been properly explored. In this paper, we try to fill this gap by reviewing the AC method and show that under certain general assumptions, it is an unbiased and consistent estimator of model accuracy with many desirable properties. We also compare this baseline estimator against some more complex estimators empirically and show that in many cases the AC method is able to beat the others, although the comparative quality of the different estimators is heavily case-dependent.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_08649
institution	arXiv
publishDate	2024
record_format	arxiv
title	Confidence-based Estimators for Predictive Performance in Model Monitoring
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2407.08649
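
The contents field describes Average Confidence (AC) as estimating model accuracy by averaging the confidence scores of a set of predictions. A minimal sketch of that idea follows. It assumes that "confidence" means the highest predicted class probability for each prediction, and that those probabilities are reasonably well calibrated; the record does not spell out the paper's exact assumptions. The function name `average_confidence` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_confidence(probabilities: Sequence[Sequence[float]]) -> float:
    """Estimate accuracy as the mean top-class probability (illustrative sketch).

    Assumption: each row holds one prediction's class probabilities, and the
    confidence of that prediction is the largest value in the row.
    """
    if not probabilities:
        raise ValueError("at least one prediction is required")
    # Confidence of each prediction = probability assigned to the predicted class.
    confidences = [max(row) for row in probabilities]
    # AC estimate = plain average of those confidences; no labels are needed.
    return sum(confidences) / len(confidences)


# Hypothetical class probabilities for three unlabeled predictions.
sample_probabilities = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]
estimated_accuracy = average_confidence(sample_probabilities)
print(f"Estimated accuracy: {estimated_accuracy:.3f}")
```

Because the estimate uses only the model's own outputs, it can be computed on production data before any ground truth labels arrive; how trustworthy it is depends on how well the confidence scores match actual correctness rates.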