Bibliographic Details
Main Authors: Jang, Seongju, Baek, Francis, Lee, SangHyun
Format: Preprint
Published: 2026
Subjects: Robotics
Online Access: https://arxiv.org/abs/2603.22903
_version_ 1866911541202255872
author Jang, Seongju
Baek, Francis
Lee, SangHyun
contents Due to the ever-changing nature of construction, many tasks on sites occur in an improvisational manner. Existing mobile construction robot studies remain limited in addressing improvisational tasks, where task-required locations, timing of task occurrence, and contextual information required for task execution are not known in advance. We propose an agent that understands improvisational tasks given in natural language, identifies the task-required location, and positions itself. The agent's functionality was decomposed into three Large Multimodal Model (LMM) modules operating in parallel, enabling the application of LMMs for task interpretation and breakdown, construction drawing-based navigation, and visual reasoning to identify non-predefined task-required locations. The agent was implemented with a quadruped robot and achieved a 92.2% success rate for identifying and positioning at task-required locations across three tests designed to assess improvisational task handling. This study enables mobile construction robots to perform non-predefined tasks autonomously.
format Preprint
id arxiv_https___arxiv_org_abs_2603_22903
institution arXiv
publishDate 2026
record_format arxiv
title Task-Aware Positioning for Improvisational Tasks in Mobile Construction Robots via an AI Agent with Multi-LMM Modules
topic Robotics
url https://arxiv.org/abs/2603.22903