Bibliographic Details
Main Authors: Torabian, Alireza; Urner, Ruth
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2412.00943
contents Calibration is a frequently invoked concept when useful label probability estimates are required on top of classification accuracy. A calibrated model is a function whose values correctly reflect underlying label probabilities. Calibration in itself, however, implies neither classification accuracy nor human-interpretable estimates, and it is not straightforward to verify calibration from finite data. There is a plethora of evaluation metrics (and loss functions), each of which assesses a specific aspect of a calibration model. In this work, we initiate an axiomatic study of the notion of calibration. We catalogue desirable properties of calibrated models as well as corresponding evaluation metrics and analyze their feasibility and correspondences. We complement this analysis with an empirical evaluation, comparing common calibration methods with a simple, interpretable decision tree.
id arxiv_https___arxiv_org_abs_2412_00943
institution arXiv
record_format arxiv
title Calibration through the Lens of Interpretability
topic Machine Learning