Bibliographic Details
Main Authors: Franck, Christopher T., Driscoll, Anne R., Szajnfarber, Zoe, Woodall, William H.
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2510.25573
author Franck, Christopher T.
Driscoll, Anne R.
Szajnfarber, Zoe
Woodall, William H.
contents Machine learning approaches have driven impressive advances in image classification. For example, convolutional neural networks achieve remarkable classification accuracy across a wide range of applications in industry, defense, and other areas. While these models boast impressive accuracy, a related concern is how to assess and maintain the calibration of the predictions they make. A classification model is said to be well calibrated if its predicted probabilities correspond to the rates at which events actually occur. While many methods are available to assess machine learning calibration and to recalibrate faulty predictions, less effort has been spent on developing approaches that continually monitor predictive models for potential loss of calibration over time. We propose a cumulative sum-based approach with dynamic limits that enables detection of miscalibration in both traditional process monitoring and concept drift applications. This allows early detection of operational context changes that affect image classification performance in the field. The proposed chart can be used broadly in any situation where the user needs to monitor probability predictions over time for potential lapses in calibration. Importantly, our method operates only on probability predictions and event outcomes and does not require under-the-hood access to the machine learning model.
format Preprint
id arxiv_https___arxiv_org_abs_2510_25573
institution arXiv
publishDate 2025
record_format arxiv
title Monitoring the calibration of probability forecasts with an application to concept drift detection involving image classification
topic Machine Learning
url https://arxiv.org/abs/2510.25573
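
The abstract describes monitoring a stream of probability predictions and event outcomes with a cumulative sum (CUSUM) chart. As a rough illustration of that general idea only, the Python sketch below runs a textbook two-sided CUSUM on standardized calibration residuals. It is not the authors' method: the record does not specify their statistic or how their dynamic limits are computed, so this sketch assumes a fixed control limit instead. The function name calibration_cusum, the reference value k, the limit h, and the clipping constant eps are all hypothetical choices made for illustration.

```python
import math


def calibration_cusum(probs, outcomes, k=0.25, h=8.0, eps=1e-6):
    """Two-sided CUSUM on standardized calibration residuals.

    Illustrative sketch only: uses a fixed limit h (an assumption),
    not the dynamic limits mentioned in the abstract.

    probs    -- predicted event probabilities, one per observation
    outcomes -- observed binary outcomes (0 or 1), same length as probs
    Returns (index of first signal or None, list of (upper, lower) statistics).
    """
    c_hi = 0.0  # accumulates evidence that events occur MORE often than predicted
    c_lo = 0.0  # accumulates evidence that events occur LESS often than predicted
    path = []
    signal_at = None
    for t, (p, y) in enumerate(zip(probs, outcomes)):
        # Clip to avoid division by zero for predictions of exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        # Under perfect calibration, y is Bernoulli(p): mean p, variance p(1-p).
        z = (y - p) / math.sqrt(p * (1.0 - p))
        c_hi = max(0.0, c_hi + z - k)
        c_lo = max(0.0, c_lo - z - k)
        path.append((c_hi, c_lo))
        if signal_at is None and (c_hi > h or c_lo > h):
            signal_at = t
    return signal_at, path
```

Like the approach described in the abstract, this sketch needs only the predicted probabilities and the observed outcomes, with no access to the model itself. For example, calibration_cusum([0.5] * 20, [1] * 20) feeds in a stream where the event always occurs despite a predicted probability of 0.5, so the upper statistic climbs steadily until it crosses h.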