Saved in:
Bibliographic Details
Main Authors: Zhang, Yuxiang, Yang, Yuqi, Shu, Jiangming, Wen, Xinyan, Sang, Jitao
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2503.06580
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • Traditional agentic workflows rely on external prompts to manage interactions with tools and the environment, which limits the autonomy of reasoning models. We position \emph{Large Agent Models (LAMs)} that internalize the generation of \emph{Chain-of-Action (CoA)}, enabling the model to autonomously decide when and how to use external tools. Our proposed AutoCoA framework combines supervised fine-tuning (SFT) and reinforcement learning (RL), allowing the model to seamlessly switch between reasoning and action while efficiently managing environment interactions. Main components include step-level action triggering, trajectory-level CoA optimization, and an internal world model to reduce real-environment interaction costs. Evaluations on open-domain QA tasks demonstrate that AutoCoA-trained agent models significantly outperform ReAct-based workflows in task completion, especially in tasks that require long-term reasoning and multi-step actions. Code and dataset are available at https://github.com/ADaM-BJTU/AutoCoA