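| Main Authors: | Zhou, Jingqi; Wang, Sheng; Deng, Dezhao; Lu, Junwen; Su, Junwei; Li, Qintong; Gao, Jiahui; Wu, Hao; Jiang, Jiyue; Kong, Lingpeng; Jin, Dunhong; Wu, Chuan |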
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2602.07883 |
| _version_ | 1866917551704899584 |
|---|---|
| author | Zhou, Jingqi; Wang, Sheng; Deng, Dezhao; Lu, Junwen; Su, Junwei; Li, Qintong; Gao, Jiahui; Wu, Hao; Jiang, Jiyue; Kong, Lingpeng; Jin, Dunhong; Wu, Chuan |
| author_facet | Zhou, Jingqi; Wang, Sheng; Deng, Dezhao; Lu, Junwen; Su, Junwei; Li, Qintong; Gao, Jiahui; Wu, Hao; Jiang, Jiyue; Kong, Lingpeng; Jin, Dunhong; Wu, Chuan |
| contents | LLM-powered agentic systems excel at complex long-horizon tasks, but remain constrained by static configurations fixed before execution. Such rigidity forces a trade-off between domain-specific performance and cross-task generalization: strong priors and compact tool spaces aid specialization but weaken transfer, while task-agnostic workflows and broad action spaces expand coverage but dilute guidance. Existing pre-execution optimization, planner-worker orchestration, and configuration patching fall short of resolving this tension, as they decouple adaptation from execution, causing information loss, fragmented optimization, and ambiguous credit assignment. We propose ToolSelf, a tool-driven runtime self-reconfiguration paradigm that abstracts configuration updates as a standardized tool interface and unifies execution and adaptation within one policy's action space. The execution agent can dynamically update sub-goals, strategies, toolboxes, context, and context-management modes based on task progress and feedback. We further introduce Configuration-Aware Two-stage Training (CAT), which combines rejection sampling fine-tuning with trajectory-level KTO reinforcement learning to internalize self-reconfiguration. Across diverse benchmarks, zero-shot ToolSelf rivals task-specialized agents; after CAT training, ToolSelf gains 28.8 points over the static-configuration baseline on average, illuminating a path toward emergent adaptivity that obviates manually injected guidance. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_07883 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Emergent Adaptation |
| topic | Artificial Intelligence |
| url | https://arxiv.org/abs/2602.07883 |
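
The abstract describes configuration updates exposed as a tool that shares one action space with ordinary task tools. The following is a minimal illustrative sketch of that idea only, not the authors' implementation: `AgentConfig`, `reconfigure`, `step`, the field names, and the two toy tools are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical runtime configuration the agent may rewrite mid-task."""
    sub_goal: str = ""
    strategy: str = ""
    toolbox: Tuple[str, ...] = ("search",)
    context_mode: str = "full"


def reconfigure(config: AgentConfig, **updates: Any) -> AgentConfig:
    """Return a new config with the requested fields replaced."""
    known = {f.name for f in fields(config)}
    unknown = set(updates) - known
    if unknown:
        raise ValueError(f"unknown config fields: {sorted(unknown)}")
    if "toolbox" in updates:
        # Accept lists (e.g. from JSON tool-call arguments) as well as tuples.
        updates["toolbox"] = tuple(updates["toolbox"])
    return replace(config, **updates)


# Toy task tools, stand-ins for real tools (assumption for illustration).
def search(query: str) -> str:
    return f"results for {query!r}"


def add(a: float, b: float) -> float:
    return a + b


TASK_TOOLS: Dict[str, Callable[..., Any]] = {"search": search, "add": add}


def step(config: AgentConfig, action: Dict[str, Any]) -> Tuple[AgentConfig, Any]:
    """Dispatch one action; 'reconfigure' is handled like any other tool call."""
    name = action["tool"]
    args = action.get("args", {})
    if name == "reconfigure":
        return reconfigure(config, **args), "config updated"
    if name not in config.toolbox:
        return config, f"tool {name!r} is not in the current toolbox"
    return config, TASK_TOOLS[name](**args)


# Usage: the agent first widens its own toolbox, then calls the new tool.
cfg = AgentConfig()
cfg, _ = step(cfg, {"tool": "reconfigure",
                    "args": {"sub_goal": "sum two numbers",
                             "toolbox": ["search", "add"]}})
cfg, observation = step(cfg, {"tool": "add", "args": {"a": 1, "b": 2}})
```

Because reconfiguration goes through the same `step` dispatcher as task tools, a single policy emitting actions can both act and adapt; how the paper's agent actually encodes or trains these calls is not specified in this record.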