Bibliographic Details
Main Authors: Huang, Yuehao, Liu, Liang, Lei, Shuangming, Ma, Yukai, Su, Hao, Mei, Jianbiao, Zhao, Pengxiang, Gu, Yaqing, Liu, Yong, Lv, Jiajun
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence; Robotics; I.2.9
Online Access: https://arxiv.org/abs/2507.11334
author Huang, Yuehao
Liu, Liang
Lei, Shuangming
Ma, Yukai
Su, Hao
Mei, Jianbiao
Zhao, Pengxiang
Gu, Yaqing
Liu, Yong
Lv, Jiajun
contents Mobile robots are increasingly required to navigate and interact within unknown and unstructured environments to meet human demands. Demand-driven navigation (DDN) enables robots to identify and locate objects based on implicit human intent, even when object locations are unknown. However, traditional data-driven DDN methods rely on pre-collected data for model training and decision-making, limiting their generalization capability in unseen scenarios. In this paper, we propose CogDDN, a VLM-based framework that emulates human cognitive and learning mechanisms by integrating fast and slow thinking systems and selectively identifying key objects essential to fulfilling user demands. CogDDN identifies appropriate target objects by semantically aligning detected objects with the given instructions. Furthermore, it incorporates a dual-process decision-making module, comprising a Heuristic Process for rapid, efficient decisions and an Analytic Process that analyzes past errors, accumulates them in a knowledge base, and continuously improves performance. Chain of Thought (CoT) reasoning strengthens the decision-making process. Extensive closed-loop evaluations on the AI2Thor simulator with the ProcThor dataset show that CogDDN outperforms single-view camera-only methods by 15%, demonstrating significant improvements in navigation accuracy and adaptability. The project page is available at https://yuehaohuang.github.io/CogDDN/.
format Preprint
id arxiv_https___arxiv_org_abs_2507_11334
institution arXiv
publishDate 2025
record_format arxiv
title CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking
topic Artificial Intelligence
Robotics
I.2.9
url https://arxiv.org/abs/2507.11334