Affichage MARC: :: Library Catalog

Enregistré dans:

Détails bibliographiques
Auteurs principaux:	Zhang, Yuxiang, Yang, Yuqi, Shu, Jiangming, Wen, Xinyan, Sang, Jitao
Format:	Preprint
Publié:	2025
Sujets:	Artificial Intelligence
Accès en ligne:	https://arxiv.org/abs/2503.06580
Tags:	Ajouter un tag Pas de tags, Soyez le premier à ajouter un tag!

_version_	1866913727212683264
author	Zhang, Yuxiang Yang, Yuqi Shu, Jiangming Wen, Xinyan Sang, Jitao
author_facet	Zhang, Yuxiang Yang, Yuqi Shu, Jiangming Wen, Xinyan Sang, Jitao
contents	Traditional agentic workflows rely on external prompts to manage interactions with tools and the environment, which limits the autonomy of reasoning models. We position \emph{Large Agent Models (LAMs)} that internalize the generation of \emph{Chain-of-Action (CoA)}, enabling the model to autonomously decide when and how to use external tools. Our proposed AutoCoA framework combines supervised fine-tuning (SFT) and reinforcement learning (RL), allowing the model to seamlessly switch between reasoning and action while efficiently managing environment interactions. Main components include step-level action triggering, trajectory-level CoA optimization, and an internal world model to reduce real-environment interaction costs. Evaluations on open-domain QA tasks demonstrate that AutoCoA-trained agent models significantly outperform ReAct-based workflows in task completion, especially in tasks that require long-term reasoning and multi-step actions. Code and dataset are available at https://github.com/ADaM-BJTU/AutoCoA
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_06580
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Agent models: Internalizing Chain-of-Action Generation into Reasoning models Zhang, Yuxiang Yang, Yuqi Shu, Jiangming Wen, Xinyan Sang, Jitao Artificial Intelligence Traditional agentic workflows rely on external prompts to manage interactions with tools and the environment, which limits the autonomy of reasoning models. We position \emph{Large Agent Models (LAMs)} that internalize the generation of \emph{Chain-of-Action (CoA)}, enabling the model to autonomously decide when and how to use external tools. Our proposed AutoCoA framework combines supervised fine-tuning (SFT) and reinforcement learning (RL), allowing the model to seamlessly switch between reasoning and action while efficiently managing environment interactions. Main components include step-level action triggering, trajectory-level CoA optimization, and an internal world model to reduce real-environment interaction costs. Evaluations on open-domain QA tasks demonstrate that AutoCoA-trained agent models significantly outperform ReAct-based workflows in task completion, especially in tasks that require long-term reasoning and multi-step actions. Code and dataset are available at https://github.com/ADaM-BJTU/AutoCoA
title	Agent models: Internalizing Chain-of-Action Generation into Reasoning models
topic	Artificial Intelligence
url	https://arxiv.org/abs/2503.06580

Documents similaires