Bibliographic Details
Main Authors: Lin, Zijun; Tang, Chao; Ye, Hanjing; Zhang, Hong
Format: Preprint
Published: 2025
Subjects: Robotics
Online Access: https://arxiv.org/abs/2503.02698
author Lin, Zijun
Tang, Chao
Ye, Hanjing
Zhang, Hong
contents Robotic instruction following requires seamless integration of visual perception, task planning, target localization, and motion execution. However, existing task planning methods for instruction following are either data-driven or underperform in zero-shot scenarios because they struggle to ground lengthy instructions into actionable plans under operational constraints. To address this, we propose FlowPlan, a structured multi-stage LLM workflow that improves the zero-shot pipeline and bridges the performance gap between zero-shot and data-driven in-context learning methods. By decomposing the planning process into four modular stages (task information retrieval, language-level reasoning, symbolic-level planning, and logical evaluation), FlowPlan generates logically coherent action sequences that adhere to operational constraints, and it further extracts contextual guidance for precise instance-level target localization. Benchmarked on ALFRED and validated in real-world applications, our method achieves performance competitive with data-driven in-context learning methods and demonstrates adaptability across diverse environments. This work advances zero-shot task planning in robotic systems without reliance on labeled data. Project website: https://instruction-following-project.github.io/.
format Preprint
id arxiv_https___arxiv_org_abs_2503_02698
institution arXiv
publishDate 2025
record_format arxiv
title FlowPlan: Zero-Shot Task Planning with LLM Flow Engineering for Robotic Instruction Following
topic Robotics
url https://arxiv.org/abs/2503.02698