Bibliographic Details
Main Authors: Huang, Fuxiang, Zhu, Jiayi, Yu, Yunfang, Xie, Yu, Guo, Yuan, Kong, Qingcong, Wu, Mingxiang, Jiang, Xinrui, Yang, Shu, Ma, Jiabo, Liu, Ziyi, Xu, Zhe, Chen, Zhixuan, Tan, Yujie, He, Zifan, Mao, Luhui, Wang, Xi, Hou, Junlin, Zhang, Lei, Luo, Qiong, Li, Zhenhui, Yao, Herui, Chen, Hao
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2509.20271
author Huang, Fuxiang
Zhu, Jiayi
Yu, Yunfang
Xie, Yu
Guo, Yuan
Kong, Qingcong
Wu, Mingxiang
Jiang, Xinrui
Yang, Shu
Ma, Jiabo
Liu, Ziyi
Xu, Zhe
Chen, Zhixuan
Tan, Yujie
He, Zifan
Mao, Luhui
Wang, Xi
Hou, Junlin
Zhang, Lei
Luo, Qiong
Li, Zhenhui
Yao, Herui
Chen, Hao
contents Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, we propose a two-stage pre-training strategy. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark of 92 specific tasks (68 internal tasks and 24 external validation tasks) spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 of the 68 internal tasks and 20 of the 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advancement toward reliable and scalable breast cancer screening and diagnosis.
format Preprint
id arxiv_https___arxiv_org_abs_2509_20271
institution arXiv
publishDate 2025
record_format arxiv
title A Versatile Foundation Model for AI-enabled Mammogram Interpretation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2509.20271
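note The abstract summarizes benchmark results as first-place counts and average ranks (1.5 over 68 internal tasks, 1.2 over 24 external tasks). The sketch below shows one way such a summary could be computed from per-task scores. It is an illustration only: the model names, task names, and scores are hypothetical, and the assumptions that higher scores are better and that tied models share the better rank are ours, not stated in the record.

```python
from statistics import mean


def rank_models(scores):
    """Rank models on one task; 1 is best. Tied scores share the better rank."""
    ordered = sorted(scores.values(), reverse=True)
    return {name: ordered.index(value) + 1 for name, value in scores.items()}


def summarize(task_scores):
    """Return {model: (average rank, number of first places)} across all tasks."""
    ranks = {}
    for scores in task_scores.values():
        for name, rank in rank_models(scores).items():
            ranks.setdefault(name, []).append(rank)
    return {name: (mean(r), r.count(1)) for name, r in ranks.items()}


# Hypothetical scores (higher is better); not taken from the paper.
task_scores = {
    "task_a": {"model_x": 0.91, "model_y": 0.88, "model_z": 0.80},
    "task_b": {"model_x": 0.75, "model_y": 0.79, "model_z": 0.70},
    "task_c": {"model_x": 0.83, "model_y": 0.81, "model_z": 0.82},
    "task_d": {"model_x": 0.66, "model_y": 0.60, "model_z": 0.64},
}

summary = summarize(task_scores)
for name, (avg_rank, firsts) in sorted(summary.items()):
    print(f"{name}: average rank {avg_rank:.2f}, "
          f"first in {firsts} of {len(task_scores)} tasks")
```

An average rank close to 1 means a model is at or near the top on most tasks, which is how the reported figures of 1.5 and 1.2 can be read alongside the first-place counts.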