:: Library Catalog

Bibliographic Details
Main Authors:	Cifani, Susanna, Bernardi, Mario Luca, Cimitile, Marta
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2605.28607

Similar Items

Improving Hospital Process Management through Process Mining: A Case Study on COVID-19 Clinical Pathways
by: Ardimento, Pasquale, et al.
Published: (2026)

Adversarial Humanities Benchmark: Results on Stylistic Robustness in Frontier Model Safety
by: Galisai, Marcello, et al.
Published: (2026)

FlowMind: Automatic Workflow Generation with LLMs
by: Zeng, Zhen, et al.
Published: (2024)

EduAgentQG: A Multi-Agent Workflow Framework for Personalized Question Generation
by: Jia, Rui, et al.
Published: (2025)

Case-Based Calibration of Adaptive Reasoning and Execution for LLM Tool Use
by: Pang, Renning, et al.
Published: (2026)

InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation?
by: Wang, Qiyao, et al.
Published: (2026)

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
by: Cao, Ruisheng, et al.
Published: (2024)

MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis
by: Rose, Daniel, et al.
Published: (2025)

Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
by: Chen, Yurun, et al.
Published: (2025)

Benchmarking Real-Time Question Answering via Executable Code Workflows
by: Zhou, Wenjie, et al.
Published: (2026)

AutoPatent: A Multi-Agent Framework for Automatic Patent Generation
by: Wang, Qiyao, et al.
Published: (2024)

TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces
by: Yang, Shu-Xun, et al.
Published: (2026)

Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution
by: Qin, Tianrui, et al.
Published: (2025)

Towards Automatic Continual Learning: A Self-Adaptive Framework for Continual Instruction Tuning
by: Lin, Peiyi, et al.
Published: (2025)

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
by: Sun, Qiushi, et al.
Published: (2025)

ARCANE: A Multi-Agent Framework for Interpretable and Configurable Alignment
by: Masters, Charlie, et al.
Published: (2025)

AgentCompass: Towards Reliable Evaluation of Agentic Workflows in Production
by: Kartik, NVJK, et al.
Published: (2025)

ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data
by: Shen, Junhong, et al.
Published: (2024)

Executable Code Actions Elicit Better LLM Agents
by: Wang, Xingyao, et al.
Published: (2024)

AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction
by: Shi, Ruijie, et al.
Published: (2026)

SkillOpt: Executive Strategy for Self-Evolving Agent Skills
by: Yang, Yifan, et al.
Published: (2026)

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
by: Zhou, Hang, et al.
Published: (2024)

StorageXTuner: An LLM Agent-Driven Automatic Tuning Framework for Heterogeneous Storage Systems
by: Lin, Qi, et al.
Published: (2025)

AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents
by: Gioacchini, Luca, et al.
Published: (2024)

L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search
by: Wang, Ziqi, et al.
Published: (2025)

DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
by: You, Ziming, et al.
Published: (2025)

Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analysis
by: Srinivas, Sakhinana Sagar, et al.
Published: (2024)

Automatic Prompt Generation via Adaptive Selection of Prompting Techniques
by: Ikenoue, Yohei, et al.
Published: (2025)

Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration
by: Zhang, Zhi, et al.
Published: (2025)

CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring
by: Su, Jiamin, et al.
Published: (2025)

ACE-$M^3$: Automatic Capability Evaluator for Multimodal Medical Models
by: Zhang, Xiechi, et al.
Published: (2024)

DOCE: Finding the Sweet Spot for Execution-Based Code Generation
by: Li, Haau-Sing, et al.
Published: (2024)

From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents
by: Yue, Ling, et al.
Published: (2026)

AdaMARP: An Adaptive Multi-Agent Interaction Framework for General Immersive Role-Playing
by: Xu, Zhenhua, et al.
Published: (2026)

$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution
by: He, Muyu, et al.
Published: (2026)

When Only the Final Text Survives: Implicit Execution Tracing for Multi-Agent Attribution
by: Nian, Yi, et al.
Published: (2026)

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
by: Li, Junlong, et al.
Published: (2025)

Governed Memory: A Production Architecture for Multi-Agent Workflows
by: Taheri, Hamed
Published: (2026)

A Multimodal Social Agent
by: Bikaki, Athina, et al.
Published: (2024)

AI Planning Framework for LLM-Based Web Agents
by: Shahnovsky, Orit, et al.
Published: (2026)