Bibliographic Details
Main Authors: Wang, Yi; Lu, Haojie; Zhang, Zhaofan; Chen, Li; Xie, Sihong
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2605.28144
_version_ 1866918526733778944
author Wang, Yi
Lu, Haojie
Zhang, Zhaofan
Chen, Li
Xie, Sihong
contents LLMs have shown remarkable proficiency in general language understanding and reasoning. However, they consistently underperform in spatial reasoning, which severely limits their application, particularly in embodied intelligence. Inspired by the success of hierarchical reinforcement learning, this paper introduces a novel method for hierarchical task decomposition in LLM spatial reasoning. Our approach guides LLMs to decompose complex tasks into manageable sub-tasks by identifying key intermediate states and generating simplified sub-environments. However, we find that LLMs often fail to derive optimal intermediate states because their spatial priors are insufficient, leading to sub-optimal task decomposition. To address this limitation and enhance the model's planning capability, we propose MCTS-Guided Group Relative Policy Optimization (M-GRPO), in which we reformulate the UCT formula by incorporating the LLM's prior predictive probabilities alongside its epistemic uncertainty. Furthermore, we implement a more fine-grained advantage function, enabling the model to learn optimal path planning. Experimental results demonstrate that our method substantially improves LLM performance on spatial tasks, including navigation, planning, and strategic games, achieving state-of-the-art results. This work paves the way for LLMs in real-world applications.
format Preprint
id arxiv_https___arxiv_org_abs_2605_28144
institution arXiv
publishDate 2026
record_format arxiv
title Deconstructing Spatial Complexity: Hierarchical Decomposition for LLM Spatial Reasoning
topic Artificial Intelligence
url https://arxiv.org/abs/2605.28144
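Illustrative note: the abstract mentions a reformulated UCT rule that uses the LLM's prior probabilities and epistemic uncertainty, together with a group-relative advantage. The record does not give the paper's formulas, so the following is only a minimal sketch of the well-known building blocks: the standard UCT score, a PUCT-style prior-weighted score, and the usual group-normalized advantage. The function names, the `uncertainty` multiplier, and the constant `c` are assumptions made for illustration, not the authors' definitions.

```python
import math
from statistics import mean, pstdev


def uct_score(q_value, parent_visits, child_visits, c=1.4):
    """Standard UCT: mean value plus a visit-count exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are explored first
    return q_value + c * math.sqrt(math.log(parent_visits) / child_visits)


def prior_weighted_score(q_value, prior, uncertainty,
                         parent_visits, child_visits, c=1.4):
    """PUCT-style score with a hypothetical uncertainty multiplier.

    `prior` stands in for the LLM's probability of the action and
    `uncertainty` for some non-negative uncertainty estimate. How the
    paper actually combines them is not stated in this record.
    """
    bonus = (c * prior * (1.0 + uncertainty)
             * math.sqrt(parent_visits) / (1 + child_visits))
    return q_value + bonus


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and std of its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, an action with a higher prior or higher uncertainty receives a larger exploration bonus, and the bonus shrinks as the action is visited more often. The paper's finer-grained advantage function would replace the simple per-group normalization shown here.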