Staff View: Snap, Segment, Deploy: A Visual Data and Detection Pipeline for Wearable Industrial Assistants :: Library Catalog

Bibliographic Details
Main Authors:	Wen, Di; Zheng, Junwei; Liu, Ruiping; Xu, Yi; Peng, Kunyu; Stiefelhagen, Rainer
Format:	Preprint
Published:	2025
Subjects:	Human-Computer Interaction; Robotics
Online Access:	https://arxiv.org/abs/2507.21072

_version_	1866909709493075968
author	Wen, Di; Zheng, Junwei; Liu, Ruiping; Xu, Yi; Peng, Kunyu; Stiefelhagen, Rainer
author_facet	Wen, Di; Zheng, Junwei; Liu, Ruiping; Xu, Yi; Peng, Kunyu; Stiefelhagen, Rainer
contents	Industrial assembly tasks increasingly demand rapid adaptation to complex procedures and varied components, yet are often conducted in environments with limited computing and connectivity and with strict privacy requirements. These constraints make conventional cloud-based or fully autonomous solutions impractical for factory deployment. This paper introduces a mobile-device-based assistant system for industrial training and operational support, enabling real-time, semi-hands-free interaction through on-device perception and voice interfaces. The system integrates lightweight object detection, speech recognition, and Retrieval-Augmented Generation (RAG) into a modular pipeline that operates entirely on-device, enabling intuitive support for part handling and procedure understanding without relying on manual supervision or cloud services. To enable scalable training, we adopt an automated data construction pipeline and introduce a two-stage refinement strategy to improve visual robustness under domain shift. Experiments on our generated dataset, Gear8, demonstrate improved robustness to domain shift and common visual corruptions. A structured user study further confirms the system's practical viability, with positive user feedback on the clarity of the guidance and the quality of the interaction. These results indicate that our framework offers a deployable solution for real-time, privacy-preserving smart assistance in industrial environments. We will release the Gear8 dataset and source code upon acceptance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_21072
institution	arXiv
publishDate	2025
record_format	arxiv
title	Snap, Segment, Deploy: A Visual Data and Detection Pipeline for Wearable Industrial Assistants
topic	Human-Computer Interaction; Robotics
url	https://arxiv.org/abs/2507.21072
