:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Shihao; Zhou, Tiancheng; Ma, Jiatong; Ding, Yanli; Yan, Yiming; Xiao, Ming; Li, Guoyi; Geng, Haiyang; Han, Yunyun; Chen, Jianhua; Deng, Yafeng
Format:	Preprint
Published:	2026
Subjects:	Multiagent Systems; Computation and Language
Online Access:	https://arxiv.org/abs/2602.09379

Similar Items

MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric Consultation
by: Li, Guoyi, et al.
Published: (2026)

Empowering Medical Multi-Agents with Clinical Consultation Flow for Dynamic Diagnosis
by: Wang, Sihan, et al.
Published: (2025)

LLM-ABM for Transportation: Assessing the Potential of LLM Agents in System Analysis
by: Liu, Tianming, et al.
Published: (2025)

MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus
by: Li, Zheng, et al.
Published: (2026)

AgentWebBench: Benchmarking Multi-Agent Coordination in Agentic Web
by: Zhong, Shanshan, et al.
Published: (2026)

Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework
by: Liu, Tianming, et al.
Published: (2024)

MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents
by: Jiang, Yixing, et al.
Published: (2025)

LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces
by: Feng, Yukang, et al.
Published: (2026)

Benchmarking LLMs' Swarm intelligence
by: Ruan, Kai, et al.
Published: (2025)

SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
by: Xiang, Yanzheng, et al.
Published: (2025)

Bridging the Last Mile of Circuit Design: PostEDA-Bench, a Hierarchical Benchmark for PPA Convergence and DRC Fixing
by: Liu, Pengju, et al.
Published: (2026)

CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs
by: Zou, Chelsea, et al.
Published: (2026)

Crisis-Bench: Benchmarking Strategic Ambiguity and Reputation Management in Large Language Models
by: Lin, Cooper, et al.
Published: (2026)

Multi-Agent Medical Decision Consensus Matrix System: An Intelligent Collaborative Framework for Oncology MDT Consultations
by: Han, Xudong, et al.
Published: (2025)

KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes
by: Lai, Eugenie, et al.
Published: (2025)

BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
by: Bettini, Matteo, et al.
Published: (2023)

AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance
by: Patel, Dhaval, et al.
Published: (2025)

AgentSearchBench: A Benchmark for AI Agent Search in the Wild
by: Wu, Bin, et al.
Published: (2026)

WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting
by: Styles, Olly, et al.
Published: (2024)

Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?
by: Yuan, Grace Chang, et al.
Published: (2026)

MedPriv-Bench: Benchmarking the Privacy-Utility Trade-off of Large Language Models in Medical Open-End Question Answering
by: Guan, Shaowei, et al.
Published: (2026)

How Real Is AI Tutoring? Comparing Simulated and Human Dialogues in One-on-One Instruction
by: Li, Ruijia, et al.
Published: (2025)

CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
by: Siegel, Zachary S., et al.
Published: (2024)

ALAS: Transactional and Dynamic Multi-Agent LLM Planning
by: Geng, Longling, et al.
Published: (2025)

Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
by: Tihanyi, Norbert, et al.
Published: (2024)

SpecBench: Evaluating Specification-Level Reasoning for Software Engineering LLM Agents
by: Hamblin, Grant, et al.
Published: (2026)

Evaluating Multi-Agent LLM Architectures for Rare Disease Diagnosis
by: Almasoud, Ahmed
Published: (2026)

SAGE: Scalable Agentic Grounded Evaluation for Crop Disease Diagnosis
by: Arshad, Muhammad Arbab, et al.
Published: (2026)

Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games
by: Magnino, Lorenzo, et al.
Published: (2026)

DDO: Dual-Decision Optimization for LLM-Based Medical Consultation via Multi-Agent Collaboration
by: Jia, Zhihao, et al.
Published: (2025)

Stronger-MAS: Multi-Agent Reinforcement Learning for Collaborative LLMs
by: Zhao, Yujie, et al.
Published: (2025)

Multi-Agent Reinforcement Learning for Multi-Cell Spectrum and Power Allocation
by: Zhang, Yiming, et al.
Published: (2023)

Second Order Statistics Analysis and Comparison between Arithmetic and Geometric Average Fusion
by: Li, Tiancheng, et al.
Published: (2019)

Is Your LLM-Based Multi-Agent a Reliable Real-World Planner? Exploring Fraud Detection in Travel Planning
by: Yao, Junchi, et al.
Published: (2025)

Agent-Kernel: A MicroKernel Multi-Agent System Framework for Adaptive Social Simulation Powered by LLMs
by: Mao, Yuren, et al.
Published: (2025)

Self-Organizing Agent Network for LLM-based Workflow Automation
by: Xiong, Yiming, et al.
Published: (2025)

Collaborative QA using Interacting LLMs. Impact of Network Structure, Node Capability and Distributed Data
by: Jain, Adit, et al.
Published: (2025)

Following the TRACE: A Structured Path to Empathetic Response Generation with Multi-Agent Models
by: Liu, Ziqi, et al.
Published: (2025)

Mapis: A Knowledge-Graph Grounded Multi-Agent Framework for Evidence-Based PCOS Diagnosis
by: He, Zanxiang, et al.
Published: (2025)

PromptSculptor: Multi-Agent Based Text-to-Image Prompt Optimization
by: Xiang, Dawei, et al.
Published: (2025)