Bibliographic Details
Main Authors: Yin, Xu; Yoon, Min-Sung; Huo, Yuchi; Zhang, Kang; Yoon, Sung-Eui
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2504.09893
Abstract:
  • Task execution for object rearrangement can be hindered by Task-Level Perturbations (TLP), i.e., unexpected object additions, removals, and displacements that can disrupt the underlying visual policies and fundamentally compromise task feasibility and progress. To address these challenges, we present LangPert, a language-based framework designed to detect and mitigate TLP situations in tabletop rearrangement tasks. LangPert integrates a Visual Language Model (VLM) to monitor both the policy's skill execution and environmental TLP, and leverages a Hierarchical Chain-of-Thought (HCoT) reasoning mechanism to enhance the Large Language Model (LLM)'s contextual understanding and generate adaptive, corrective skill-execution plans. Our experimental results demonstrate that LangPert handles diverse TLP situations more effectively than baseline methods, achieving higher task completion rates, improved execution efficiency, and potential generalization to unseen scenarios.