Staff View: OVAMOS: A Framework for Open-Vocabulary Multi-Object Search in Unknown Environments

Bibliographic Details
Main Authors:	Wang, Qianwei; Xu, Yifan; Kamat, Vineet; Menassa, Carol
Format:	Preprint
Published:	2025
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2503.02106

_version_	1866910857255976960
author	Wang, Qianwei; Xu, Yifan; Kamat, Vineet; Menassa, Carol
author_facet	Wang, Qianwei; Xu, Yifan; Kamat, Vineet; Menassa, Carol
contents	Object search is a fundamental task for robots deployed in indoor building environments, yet challenges arise due to observation instability, especially for open-vocabulary models. While foundation models (LLMs/VLMs) enable reasoning about object locations even without direct visibility, the ability to recover from failures and replan remains crucial. The Multi-Object Search (MOS) problem further increases complexity, requiring the tracking of multiple objects and thorough exploration in novel environments, making observation uncertainty a significant obstacle. To address these challenges, we propose a framework integrating VLM-based reasoning, frontier-based exploration, and a Partially Observable Markov Decision Process (POMDP) framework to solve the MOS problem in novel environments. The VLM enhances search efficiency by inferring object-environment relationships, frontier-based exploration guides navigation in unknown spaces, and the POMDP models observation uncertainty, allowing recovery from failures in occluded and cluttered environments. We evaluate our framework on 120 simulated scenarios across several Habitat-Matterport3D (HM3D) scenes and a real-world robot experiment in a 50-square-meter office, demonstrating significant improvements in both efficiency and success rate over baseline methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_02106
institution	arXiv
publishDate	2025
record_format	arxiv
title	OVAMOS: A Framework for Open-Vocabulary Multi-Object Search in Unknown Environments
topic	Robotics
url	https://arxiv.org/abs/2503.02106
