Staff View: Uncovering Overconfident Failures in CXR Models via Augmentation-Sensitivity Risk Scoring :: Library Catalog

Bibliographic Details
Main Authors:	Shu, Han-Jay; Chiu, Wei-Ning; Chang, Shun-Ting; Huang, Meng-Ping; Tohyama, Takeshi; Han, Ahram; Kuo, Po-Chih
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.01683

_version_	1866911188901691392
author	Shu, Han-Jay; Chiu, Wei-Ning; Chang, Shun-Ting; Huang, Meng-Ping; Tohyama, Takeshi; Han, Ahram; Kuo, Po-Chih
author_facet	Shu, Han-Jay; Chiu, Wei-Ning; Chang, Shun-Ting; Huang, Meng-Ping; Tohyama, Takeshi; Han, Ahram; Kuo, Po-Chih
contents	Deep learning models achieve strong performance in chest radiograph (CXR) interpretation, yet fairness and reliability concerns persist. Models often show uneven accuracy across patient subgroups, leading to hidden failures not reflected in aggregate metrics. Existing error detection approaches -- based on confidence calibration or out-of-distribution (OOD) detection -- struggle with subtle within-distribution errors, while image- and representation-level consistency-based methods remain underexplored in medical imaging. We propose an augmentation-sensitivity risk scoring (ASRS) framework to identify error-prone CXR cases. ASRS applies clinically plausible rotations ($\pm 15^\circ$/$\pm 30^\circ$) and measures embedding shifts with the RAD-DINO encoder. Sensitivity scores stratify samples into stability quartiles, where highly sensitive cases show substantially lower recall ($-0.2$ to $-0.3$) despite high AUROC and confidence. ASRS provides a label-free means for selective prediction and clinician review, improving fairness and safety in medical AI.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_01683
institution	arXiv
publishDate	2025
record_format	arxiv
title	Uncovering Overconfident Failures in CXR Models via Augmentation-Sensitivity Risk Scoring
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2510.01683
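
The scoring idea summarized in the contents field (embed each image and its rotated views, measure the embedding shift, then split cases into stability quartiles) can be illustrated with a minimal sketch. It assumes the embeddings of the original and rotated images have already been computed; the record names RAD-DINO as the encoder, which is not loaded here. The cosine-distance metric, the function names `sensitivity_scores` and `stability_quartiles`, and the array shapes are assumptions made for illustration, not details taken from the preprint.

```python
import numpy as np


def sensitivity_scores(base, rotated):
    """Mean cosine distance between each image and its rotated views.

    base:    (n, d) embeddings of the original images.
    rotated: (k, n, d) embeddings of the same images under k rotations
             (for example -30, -15, +15, +30 degrees).
    Cosine distance is an assumed choice of "embedding shift".
    """
    base = np.asarray(base, dtype=float)
    rotated = np.asarray(rotated, dtype=float)
    # Normalize every embedding to unit length.
    base_n = base / np.linalg.norm(base, axis=1, keepdims=True)
    rot_n = rotated / np.linalg.norm(rotated, axis=2, keepdims=True)
    # Cosine similarity of each rotated view with its original: shape (k, n).
    cos = np.sum(rot_n * base_n[None, :, :], axis=2)
    # Average the distance over the k rotations: shape (n,).
    return (1.0 - cos).mean(axis=0)


def stability_quartiles(scores):
    """Assign each score a quartile index: 0 = most stable, 3 = most sensitive."""
    scores = np.asarray(scores, dtype=float)
    edges = np.quantile(scores, [0.25, 0.5, 0.75])
    return np.searchsorted(edges, scores, side="right")


if __name__ == "__main__":
    # Synthetic stand-in embeddings; real ones would come from the encoder.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(8, 16))
    noise_scale = np.linspace(0.01, 1.0, 8)[None, :, None]
    rotated = base[None, :, :] + noise_scale * rng.normal(size=(4, 8, 16))
    scores = sensitivity_scores(base, rotated)
    print(stability_quartiles(scores))
```

No labels are needed to compute the scores, which matches the label-free framing in the abstract. Cases in the top quartile would be the candidates for selective prediction or clinician review.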

Similar Items