Staff View: Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents

Bibliographic Details
Main Authors:	Hassan, Sabit; Chung, Hye-Young; Tan, Xiang Zhi; Alikhani, Malihe
Format:	Preprint
Published:	2024
Subjects:	Robotics; Computation and Language
Online Access:	https://arxiv.org/abs/2410.14141

_version_	1866913706450878464
author	Hassan, Sabit; Chung, Hye-Young; Tan, Xiang Zhi; Alikhani, Malihe
author_facet	Hassan, Sabit; Chung, Hye-Young; Tan, Xiang Zhi; Alikhani, Malihe
contents	When assisting people in daily tasks, robots need to accurately interpret visual cues and respond effectively in diverse safety-critical situations, such as sharp objects on the floor. In this context, we present M-CoDAL, a multimodal-dialogue system specifically designed for embodied agents to better understand and communicate in safety-critical situations. The system leverages discourse coherence relations to enhance its contextual understanding and communication abilities. To train this system, we introduce a novel clustering-based active learning mechanism that utilizes an external Large Language Model (LLM) to identify informative instances. Our approach is evaluated using a newly created multimodal dataset comprising 1K safety violations extracted from 2K Reddit images. These violations are annotated using a Large Multimodal Model (LMM) and verified by human annotators. Results with this dataset demonstrate that our approach improves resolution of safety situations, user sentiment, as well as safety of the conversation. Next, we deploy our dialogue system on a Hello Robot Stretch robot and conduct a within-subject user study with real-world participants. In the study, participants role-play two safety scenarios with different levels of severity with the robot and receive interventions from our model and a baseline system powered by OpenAI's ChatGPT. The study results corroborate and extend the findings from the automated evaluation, showing that our proposed system is more persuasive in a real-world embodied agent setting.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_14141
institution	arXiv
publishDate	2024
record_format	arxiv
title	Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
topic	Robotics; Computation and Language
url	https://arxiv.org/abs/2410.14141

Similar Items