Staff View: Cross-Modal Clinical Knowledge Integration for Mammography Report Generation :: Library Catalog


Bibliographic Details
Main Authors:	Zhu, Jiayi; Huang, Fuxiang; Xie, Yu; Wang, Xi; Chen, Zhixuan; Guo, Yuan; Kong, Qingcong; Li, Zhenhui; Luo, Qiong; Chen, Hao
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.31093

_version_	1866918531346464768
author	Zhu, Jiayi; Huang, Fuxiang; Xie, Yu; Wang, Xi; Chen, Zhixuan; Guo, Yuan; Kong, Qingcong; Li, Zhenhui; Luo, Qiong; Chen, Hao
author_facet	Zhu, Jiayi; Huang, Fuxiang; Xie, Yu; Wang, Xi; Chen, Zhixuan; Guo, Yuan; Kong, Qingcong; Li, Zhenhui; Luo, Qiong; Chen, Hao
contents	Breast cancer is a major global health concern, and mammography screening plays a central role in early detection. The large volume of screening examinations creates a substantial workload for radiologists, making accurate and consistent report generation a critical clinical challenge. Existing automated mammography report generation methods primarily focus on direct visual-to-text mapping, while overlooking the structured clinical reasoning process followed by radiologists in real-world practice. To address this limitation, we propose MammoRG, a mammography report generation framework that explicitly simulates the clinical reporting workflow by following the BI-RADS guideline and incorporating prior clinical knowledge to produce diagnostic reports. Specifically, MammoRG adopts a two-stage training framework. In the first stage, the model learns to integrate clinically relevant prior knowledge from a patient's four-view mammograms through classification-based supervision. In the second stage, a terminology-aware supervised fine-tuning strategy is introduced to model mammography-specific clinical terms as atomic semantic units, enabling the generation of high-quality reports with improved clinical consistency. To facilitate clinical efficacy evaluation of generated reports, we further develop MammoRGTool, a dedicated mammography report parsing tool that extracts structured clinical information from free-text reports. Extensive experiments demonstrate that MammoRG consistently outperforms existing methods across multiple clinical efficacy metrics, particularly in diagnosis-related BI-RADS F1, where it surpasses the second-best model by 2.73%, 2.04%, 1.90%, and 3.27% on the internal, external 1, external 2, and VinDr-Mammo datasets, respectively.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_31093
institution	arXiv
publishDate	2026
record_format	arxiv
title	Cross-Modal Clinical Knowledge Integration for Mammography Report Generation
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2605.31093
