Bibliographic Details
Main Authors: Vo, Hung Q., Zare, Samira, Ly, Son T., Wang, Lin, Ezeana, Chika F., Yu, Xiaohui, Wong, Kelvin K., Wong, Stephen T. C., Nguyen, Hien V.
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2503.06759
author Vo, Hung Q.
Zare, Samira
Ly, Son T.
Wang, Lin
Ezeana, Chika F.
Yu, Xiaohui
Wong, Kelvin K.
Wong, Stephen T. C.
Nguyen, Hien V.
contents Achieving health equity in Artificial Intelligence (AI) requires diagnostic models that maintain reliability across diverse populations. However, breast cancer screening systems frequently suffer from domain overfitting, degrading significantly when deployed to varying demographics. While Invariant Learning algorithms aim to mitigate this by suppressing site-specific correlations, their efficacy in medical imaging remains underexplored. This study comprehensively evaluates domain generalization techniques for mammography. We constructed a multi-source training environment aggregating datasets from the United States (CBIS-DDSM, EMBED), Portugal (INbreast, BCDR), and Cyprus (BMCD). To assess global generalizability, we evaluated performance on unseen cohorts from Egypt (CDD-CESM) and Sweden (CSAW-CC). We benchmarked Invariant Risk Minimization (IRM) and Variance Risk Extrapolation (VREx) against a rigorously optimized Empirical Risk Minimization (ERM) baseline. Contrary to expectations, standard ERM consistently outperformed specialized invariant mechanisms on out-of-domain testing. While VREx showed potential in stabilizing attention maps, invariant objectives proved unstable and prone to underfitting. We conclude that engineering equitable AI is currently best served by maximizing multi-national data diversity rather than relying on complex algorithmic invariance.
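The abstract contrasts ERM with invariance objectives but gives no formulas. As a rough illustration only, the sketch below shows how an ERM objective and a VREx-style objective can differ when computed from per-environment risks. It assumes the commonly used formulation in which the VREx objective is the mean environment risk plus a weight `beta` times the variance of those risks. The function names, the `beta` value, the environment labels, and the toy predictions are all hypothetical and are not taken from the paper. ERM is simplified here to an average of per-environment risks, which matches pooled ERM only when environments are equally sized.

```python
import math


def bce_risk(probs, labels):
    """Mean binary cross-entropy of predicted probabilities against 0/1 labels."""
    eps = 1e-12  # guards against log(0)
    losses = [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps)))
        for p, y in zip(probs, labels)
    ]
    return sum(losses) / len(losses)


def erm_objective(env_risks):
    """Simplified ERM: average of the per-environment risks."""
    return sum(env_risks) / len(env_risks)


def vrex_objective(env_risks, beta):
    """VREx-style objective: mean risk plus beta times the variance of risks."""
    mean_risk = erm_objective(env_risks)
    variance = sum((r - mean_risk) ** 2 for r in env_risks) / len(env_risks)
    return mean_risk + beta * variance


# Hypothetical toy data: (predicted probabilities, labels) per training site.
toy_envs = {
    "site_a": ([0.9, 0.2, 0.8], [1, 0, 1]),
    "site_b": ([0.6, 0.4, 0.7], [1, 0, 1]),
    "site_c": ([0.3, 0.5, 0.4], [1, 0, 1]),
}

env_risks = [bce_risk(p, y) for p, y in toy_envs.values()]
erm_value = erm_objective(env_risks)
vrex_value = vrex_objective(env_risks, beta=10.0)  # beta is an arbitrary choice
```

When the per-environment risks are unequal, the variance term makes the VREx-style value exceed the ERM value, so minimizing it pushes the model toward equal risk across sites. With `beta=0` the two objectives coincide.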
format Preprint
id arxiv_https___arxiv_org_abs_2503_06759
institution arXiv
publishDate 2025
record_format arxiv
title Revisiting Invariant Learning for Out-of-Domain Generalization on Multi-Site Mammogram Datasets
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2503.06759