Staff View: A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering :: Library Catalog

Bibliographic Details
Main Authors:	Yi, Ziruo; Liu, Jinyu; Xiao, Ting; Albert, Mark V.
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Information Retrieval
Online Access:	https://arxiv.org/abs/2508.02841

_version_	1866913975019503616
author	Yi, Ziruo; Liu, Jinyu; Xiao, Ting; Albert, Mark V.
author_facet	Yi, Ziruo; Liu, Jinyu; Xiao, Ting; Albert, Mark V.
contents	Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges in factual accuracy, hallucinations, and cross-modal misalignment. We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. We evaluate our system on a challenging RVQA set curated via model disagreement filtering, comprising consistently hard cases across multiple MLLMs. Extensive experiments demonstrate the superiority and effectiveness of our system over strong MLLM baselines, with a case study illustrating its reliability and interpretability. This work highlights the potential of multi-agent approaches to support explainable and trustworthy clinical AI applications that require complex reasoning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_02841
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering
topic	Artificial Intelligence; Information Retrieval
url	https://arxiv.org/abs/2508.02841

Similar Items