Staff View: Solving Sokoban using Hierarchical Reinforcement Learning with Landmarks :: Library Catalog


Bibliographic Details
Main Author:	Pastukhov, Sergey
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.04366

_version_	1866910905073139712
author	Pastukhov, Sergey
author_facet	Pastukhov, Sergey
contents	We introduce a novel hierarchical reinforcement learning (HRL) framework that performs top-down recursive planning via learned subgoals, successfully applied to the complex combinatorial puzzle game Sokoban. Our approach constructs a six-level policy hierarchy, where each higher-level policy generates subgoals for the level below. All subgoals and policies are learned end-to-end from scratch, without any domain knowledge. Our results show that the agent can generate long action sequences from a single high-level call. While prior work has explored 2-3 level hierarchies and subgoal-based planning heuristics, we demonstrate that deep recursive goal decomposition can emerge purely from learning, and that such hierarchies can scale effectively to hard puzzle domains.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_04366
institution	arXiv
publishDate	2025
record_format	arxiv
title	Solving Sokoban using Hierarchical Reinforcement Learning with Landmarks
topic	Artificial Intelligence
url	https://arxiv.org/abs/2504.04366
