Bibliographic Details
Main Authors: Zhou, Jingqi, Wang, Sheng, Deng, Dezhao, Lu, Junwen, Su, Junwei, Li, Qintong, Gao, Jiahui, Wu, Hao, Jiang, Jiyue, Kong, Lingpeng, Jin, Dunhong, Wu, Chuan
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.07883
author Zhou, Jingqi
Wang, Sheng
Deng, Dezhao
Lu, Junwen
Su, Junwei
Li, Qintong
Gao, Jiahui
Wu, Hao
Jiang, Jiyue
Kong, Lingpeng
Jin, Dunhong
Wu, Chuan
contents LLM-powered agentic systems excel at complex long-horizon tasks, but remain constrained by static configurations fixed before execution. Such rigidity forces a trade-off between domain-specific performance and cross-task generalization: strong priors and compact tool spaces aid specialization but weaken transfer, while task-agnostic workflows and broad action spaces expand coverage but dilute guidance. Existing pre-execution optimization, planner-worker orchestration, and configuration patching fall short of resolving this tension, as they decouple adaptation from execution, causing information loss, fragmented optimization, and ambiguous credit assignment. We propose ToolSelf, a tool-driven runtime self-reconfiguration paradigm that abstracts configuration updates as a standardized tool interface and unifies execution and adaptation within one policy's action space. The execution agent can dynamically update sub-goals, strategies, toolboxes, context, and context-management modes based on task progress and feedback. We further introduce Configuration-Aware Two-stage Training (CAT), which combines rejection sampling fine-tuning with trajectory-level KTO reinforcement learning to internalize self-reconfiguration. Across diverse benchmarks, zero-shot ToolSelf rivals task-specialized agents; after CAT training, ToolSelf gains 28.8 points over the static-configuration baseline on average, illuminating a path toward emergent adaptivity that obviates manually injected guidance.
format Preprint
id arxiv_https___arxiv_org_abs_2602_07883
institution arXiv
publishDate 2026
record_format arxiv
title ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Emergent Adaptation
topic Artificial Intelligence
url https://arxiv.org/abs/2602.07883