Bibliographic Details
Main Authors: Yuan, Grace Chang; Zhang, Xiaoman; Kim, Sung Eun; Rajpurkar, Pranav
Format: Preprint
Published: 2026
Subjects: Computation and Language; Artificial Intelligence; Multiagent Systems
Online Access: https://arxiv.org/abs/2603.04421
author Yuan, Grace Chang
Zhang, Xiaoman
Kim, Sung Eun
Rajpurkar, Pranav
contents Multi-agent large language model (LLM) systems have emerged as a promising approach for clinical diagnosis, leveraging collaboration among agents to refine medical reasoning. However, most existing frameworks rely on single-vendor teams (e.g., multiple agents from the same model family), which risk correlated failure modes that reinforce shared biases rather than correcting them. We investigate the impact of vendor diversity by comparing Single-LLM, Single-Vendor, and Mixed-Vendor Multi-Agent Conversation (MAC) frameworks. Using three doctor agents instantiated with o4-mini, Gemini-2.5-Pro, and Claude-4.5-Sonnet, we evaluate performance on RareBench and DiagnosisArena. Mixed-vendor configurations consistently outperform single-vendor counterparts, achieving state-of-the-art recall and accuracy. Overlap analysis reveals the underlying mechanism: mixed-vendor teams pool complementary inductive biases, surfacing correct diagnoses that individual models or homogeneous teams collectively miss. These results highlight vendor diversity as a key design principle for robust clinical diagnostic systems.
format Preprint
id arxiv_https___arxiv_org_abs_2603_04421
institution arXiv
publishDate 2026
record_format arxiv
title Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?
topic Computation and Language
Artificial Intelligence
Multiagent Systems
url https://arxiv.org/abs/2603.04421