Bibliographic Details
Main Authors:	Orimo, Yuki; Kurata, Iori; Mori, Hodaka; Okuno, Ryuhei; Sawada, Ryohto; Okanohara, Daisuke
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.03549

Similar Items

When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution
by: Zhu, Zilin, et al.
Published: (2026)

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
by: Li, Junlong, et al.
Published: (2025)

SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks
by: Li, Jialiang, et al.
Published: (2025)

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
by: Yang, Cheng, et al.
Published: (2025)

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks
by: Zhang, Yuxiang, et al.
Published: (2025)

NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents
by: Song, Yang, et al.
Published: (2026)

Learning Agent-Compatible Context Management for Long-Horizon Tasks
by: Yi, Lu, et al.
Published: (2026)

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks
by: Orlanski, Gabriel, et al.
Published: (2026)

REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation
by: Yuan, Puzhen, et al.
Published: (2025)

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents
by: Li, Yunfan, et al.
Published: (2026)

GTA: Generating Long-Horizon Tasks for Web Agents at Scale
by: Huang, Tenghao, et al.
Published: (2026)

ARC: Active and Reflection-driven Context Management for Long-Horizon Information Seeking Agents
by: Yao, Yilun, et al.
Published: (2026)

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
by: Sinha, Akshit, et al.
Published: (2025)

OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
by: Zhang, Le, et al.
Published: (2026)

AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning
by: Hu, Yuyang, et al.
Published: (2026)

ELHPlan: Efficient Long-Horizon Task Planning for Multi-Agent Collaboration
by: Ling, Shaobin, et al.
Published: (2025)

FCRF: Flexible Constructivism Reflection for Long-Horizon Robotic Task Planning with Large Language Models
by: Song, Yufan, et al.
Published: (2025)

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks
by: Wu, Xiyang, et al.
Published: (2026)

MemPO: Self-Memory Policy Optimization for Long-Horizon Agents
by: Li, Ruoran, et al.
Published: (2026)

Orchestrator: Active Inference for Multi-Agent Systems in Long-Horizon Tasks
by: Beckenbauer, Lukas, et al.
Published: (2025)

The Illusion of Procedural Reasoning: Measuring Long-Horizon FSM Execution in LLMs
by: Samiei, Mahdi, et al.
Published: (2025)

"DIVE" into Hydrogen Storage Materials Discovery with AI Agents
by: Zhang, Di, et al.
Published: (2025)

ContextFlow: Hierarchical Task-State Alignment for Long-Horizon Embodied Agents
by: Guo, Shuhan, et al.
Published: (2026)

Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution
by: Yan, Boyang
Published: (2025)

Heterogeneous Multi-Expert Reinforcement Learning for Long-Horizon Multi-Goal Tasks in Autonomous Forklifts
by: Chen, Yun, et al.
Published: (2026)

Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution
by: Ni, Ziyi, et al.
Published: (2024)

Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
by: Kar, Indrajit, et al.
Published: (2025)

LH-Bench: Skill-Grounded Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks
by: Chandwani, Abhishek, et al.
Published: (2026)

STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning
by: Lei, Mingcong, et al.
Published: (2025)

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration
by: Zhang, Yao, et al.
Published: (2024)

Long-Horizon Visual Imitation Learning via Plan and Code Reflection
by: Chen, Quan, et al.
Published: (2025)

Intrinsic Stability Limits of Autoregressive Reasoning: Structural Consequences for Long-Horizon Execution
by: Liao, Hsien-Jyh
Published: (2026)

When the Specification Emerges: Benchmarking Faithfulness Loss in Long-Horizon Coding Agents
by: Yan, Lu, et al.
Published: (2026)

The Conversations Beneath the Code: Triadic Data for Long-Horizon Software Engineering Agents
by: Kim, Yelin
Published: (2026)

A Thermodynamic Theory of Learning Part II: Critical Period Closure and Continual Learning Failure
by: Okanohara, Daisuke
Published: (2026)

A Thermodynamic Theory of Learning I: Irreversible Ensemble Transport and Epistemic Costs
by: Okanohara, Daisuke
Published: (2026)

Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training
by: Fu, Dayuan, et al.
Published: (2025)

SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents
by: Zhao, Bingchen, et al.
Published: (2026)

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery
by: Feng, Shiyang, et al.
Published: (2026)

STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks
by: Lobo, Elita, et al.
Published: (2026)