Staff View: On Evaluation of Vision Datasets and Models using Human Competency Frameworks :: Library Catalog

Bibliographic Details
Main Authors:	Ramachandran, Rahul; Kulkarni, Tejal; Sharma, Charchit; Vijaykeerthy, Deepak; Balasubramanian, Vineeth N
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2409.04041

_version_	1866913492783595520
author	Ramachandran, Rahul; Kulkarni, Tejal; Sharma, Charchit; Vijaykeerthy, Deepak; Balasubramanian, Vineeth N
author_facet	Ramachandran, Rahul; Kulkarni, Tejal; Sharma, Charchit; Vijaykeerthy, Deepak; Balasubramanian, Vineeth N
contents	Evaluating models and datasets in computer vision remains a challenging task, with most leaderboards relying solely on accuracy. While accuracy is a popular metric for model evaluation, it provides only a coarse assessment by considering a single model's score on all dataset items. This paper explores Item Response Theory (IRT), a framework that infers interpretable latent parameters for an ensemble of models and each dataset item, enabling richer evaluation and analysis beyond the single accuracy number. Leveraging IRT, we assess model calibration, select informative data subsets, and demonstrate the usefulness of its latent parameters for analyzing and comparing models and datasets in computer vision.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_04041
institution	arXiv
publishDate	2024
record_format	arxiv
title	On Evaluation of Vision Datasets and Models using Human Competency Frameworks
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2409.04041

Similar Items