| Main Authors: | Orimo, Yuki, Kurata, Iori, Mori, Hodaka, Okuno, Ryuhei, Sawada, Ryohto, Okanohara, Daisuke |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.03549 |
Similar Items
When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution
by: Zhu, Zilin, et al.
Published: (2026)
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
by: Li, Junlong, et al.
Published: (2025)
SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks
by: Li, Jialiang, et al.
Published: (2025)
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
by: Yang, Cheng, et al.
Published: (2025)
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks
by: Zhang, Yuxiang, et al.
Published: (2025)
NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents
by: Song, Yang, et al.
Published: (2026)
Learning Agent-Compatible Context Management for Long-Horizon Tasks
by: Yi, Lu, et al.
Published: (2026)
SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks
by: Orlanski, Gabriel, et al.
Published: (2026)
REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation
by: Yuan, Puzhen, et al.
Published: (2025)
Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents
by: Li, Yunfan, et al.
Published: (2026)
GTA: Generating Long-Horizon Tasks for Web Agents at Scale
by: Huang, Tenghao, et al.
Published: (2026)
ARC: Active and Reflection-driven Context Management for Long-Horizon Information Seeking Agents
by: Yao, Yilun, et al.
Published: (2026)
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
by: Sinha, Akshit, et al.
Published: (2025)
OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
by: Zhang, Le, et al.
Published: (2026)
AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning
by: Hu, Yuyang, et al.
Published: (2026)
ELHPlan: Efficient Long-Horizon Task Planning for Multi-Agent Collaboration
by: Ling, Shaobin, et al.
Published: (2025)
FCRF: Flexible Constructivism Reflection for Long-Horizon Robotic Task Planning with Large Language Models
by: Song, Yufan, et al.
Published: (2025)
Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks
by: Wu, Xiyang, et al.
Published: (2026)
MemPO: Self-Memory Policy Optimization for Long-Horizon Agents
by: Li, Ruoran, et al.
Published: (2026)
Orchestrator: Active Inference for Multi-Agent Systems in Long-Horizon Tasks
by: Beckenbauer, Lukas, et al.
Published: (2025)
The Illusion of Procedural Reasoning: Measuring Long-Horizon FSM Execution in LLMs
by: Samiei, Mahdi, et al.
Published: (2025)
"DIVE" into Hydrogen Storage Materials Discovery with AI Agents
by: Zhang, Di, et al.
Published: (2025)
ContextFlow: Hierarchical Task-State Alignment for Long-Horizon Embodied Agents
by: Guo, Shuhan, et al.
Published: (2026)
Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution
by: Yan, Boyang
Published: (2025)
Heterogeneous Multi-Expert Reinforcement Learning for Long-Horizon Multi-Goal Tasks in Autonomous Forklifts
by: Chen, Yun, et al.
Published: (2026)
Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution
by: Ni, Ziyi, et al.
Published: (2024)
Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
by: Kar, Indrajit, et al.
Published: (2025)
LH-Bench: Skill-Grounded Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks
by: Chandwani, Abhishek, et al.
Published: (2026)
STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning
by: Lei, Mingcong, et al.
Published: (2025)
WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration
by: Zhang, Yao, et al.
Published: (2024)
Long-Horizon Visual Imitation Learning via Plan and Code Reflection
by: Chen, Quan, et al.
Published: (2025)
Intrinsic Stability Limits of Autoregressive Reasoning: Structural Consequences for Long-Horizon Execution
by: Liao, Hsien-Jyh
Published: (2026)
When the Specification Emerges: Benchmarking Faithfulness Loss in Long-Horizon Coding Agents
by: Yan, Lu, et al.
Published: (2026)
The Conversations Beneath the Code: Triadic Data for Long-Horizon Software Engineering Agents
by: Kim, Yelin
Published: (2026)
A Thermodynamic Theory of Learning Part II: Critical Period Closure and Continual Learning Failure
by: Okanohara, Daisuke
Published: (2026)
A Thermodynamic Theory of Learning I: Irreversible Ensemble Transport and Epistemic Costs
by: Okanohara, Daisuke
Published: (2026)
Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training
by: Fu, Dayuan, et al.
Published: (2025)
SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents
by: Zhao, Bingchen, et al.
Published: (2026)
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery
by: Feng, Shiyang, et al.
Published: (2026)
STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks
by: Lobo, Elita, et al.
Published: (2026)