Staff View: A Multimodal Multi-Agent Framework for Radiology Report Generation

Bibliographic Details
Main Authors:	Yi, Ziruo; Xiao, Ting; Albert, Mark V.
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2505.09787

_version_	1866912376587026432
author	Yi, Ziruo; Xiao, Ting; Albert, Mark V.
author_facet	Yi, Ziruo; Xiao, Ting; Albert, Mark V.
contents	Radiology report generation (RRG) aims to automatically produce diagnostic reports from medical images, with the potential to enhance clinical workflows and reduce radiologists' workload. While recent approaches leveraging multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have achieved strong results, they continue to face challenges such as factual inconsistency, hallucination, and cross-modal misalignment. We propose a multimodal multi-agent framework for RRG that aligns with the stepwise clinical reasoning workflow, where task-specific agents handle retrieval, draft generation, visual analysis, refinement, and synthesis. Experimental results demonstrate that our approach outperforms a strong baseline in both automatic metrics and LLM-based evaluations, producing more accurate, structured, and interpretable reports. This work highlights the potential of clinically aligned multi-agent frameworks to support explainable and trustworthy clinical AI applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_09787
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Multimodal Multi-Agent Framework for Radiology Report Generation
topic	Artificial Intelligence
url	https://arxiv.org/abs/2505.09787

Similar Items