Bibliographic Details
Main Authors: Yuan, Guozhi; Liu, Youfeng; Yang, Jingli; Jia, Wei; Lin, Kai; Gao, Yansong; He, Shan; Ding, Zilin; Li, Haitao
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2501.07054
author Yuan, Guozhi
Liu, Youfeng
Yang, Jingli
Jia, Wei
Lin, Kai
Gao, Yansong
He, Shan
Ding, Zilin
Li, Haitao
contents Owing to their strong comprehension and reasoning capabilities, agent frameworks driven by Large Language Models (LLMs) have achieved significant success in numerous complex reasoning tasks. ReAct-like agents solve intricate problems step by step through progressive planning and tool calls, iteratively refining each new step based on environmental feedback. However, as the planning capabilities of LLMs improve, the actions invoked by tool calls in ReAct-like frameworks often fail to match the demands of complex planning and challenging data organization. Code Action addresses these issues, but it also introduces a more complex action space and makes actions harder to organize. To leverage Code Action while managing this complexity, this paper proposes the Policy and Action Dual-Control Agent (PoAct) for generalized applications. PoAct aims to produce higher-quality code actions and more accurate reasoning paths by dynamically switching reasoning policies and modifying the action space. Experimental results on the Agent Benchmark for both legal and generic scenarios demonstrate the superior reasoning capabilities and reduced token consumption of our approach on complex tasks. On the LegalAgentBench, our method shows a 20 percent improvement over the baseline while requiring fewer tokens. We conducted experiments and analyses on the GPT-4o and GLM-4 series models, demonstrating the significant potential and scalability of our approach for solving complex problems.
format Preprint
id arxiv_https___arxiv_org_abs_2501_07054
institution arXiv
publishDate 2025
record_format arxiv
title PoAct: Policy and Action Dual-Control Agent for Generalized Applications
topic Artificial Intelligence
url https://arxiv.org/abs/2501.07054