Staff View: FlowReasoner: Reinforcing Query-Level Meta-Agents

Bibliographic Details
Main Authors:	Gao, Hongcheng; Liu, Yue; He, Yufei; Dou, Longxu; Du, Chao; Deng, Zhijie; Hooi, Bryan; Lin, Min; Pang, Tianyu
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.15257

_version_	1866916699780939776
author	Gao, Hongcheng; Liu, Yue; He, Yufei; Dou, Longxu; Du, Chao; Deng, Zhijie; Hooi, Bryan; Lin, Min; Pang, Tianyu
author_facet	Gao, Hongcheng; Liu, Yue; He, Yufei; Dou, Longxu; Du, Chao; Deng, Zhijie; Hooi, Bryan; Lin, Min; Pang, Tianyu
contents	This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, we first distill DeepSeek R1 to give FlowReasoner a basic ability to reason about generating multi-agent systems. We then enhance it further via reinforcement learning (RL) with external execution feedback. A multi-purpose reward guides the RL training with respect to performance, complexity, and efficiency. In this manner, FlowReasoner can generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% accuracy across three benchmarks. The code is available at https://github.com/sail-sg/FlowReasoner.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_15257
institution	arXiv
publishDate	2025
record_format	arxiv
title	FlowReasoner: Reinforcing Query-Level Meta-Agents
topic	Artificial Intelligence
url	https://arxiv.org/abs/2504.15257
