Staff View: InspecSafe-V1: A Multimodal Benchmark for Safety Assessment in Industrial Inspection Scenarios :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Zeyi; Liu, Shuang; Min, Jihai; Zhang, Zhaoheng; Cen, Jun; Han, Pengyu; Hu, Songqiao; Meng, Zihan; He, Xiao; Zhou, Donghua
Format:	Preprint
Published:	2026
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.21173

_version_	1866914289333305344
author	Liu, Zeyi Liu, Shuang Min, Jihai Zhang, Zhaoheng Cen, Jun Han, Pengyu Hu, Songqiao Meng, Zihan He, Xiao Zhou, Donghua
author_facet	Liu, Zeyi Liu, Shuang Min, Jihai Zhang, Zhaoheng Cen, Jun Han, Pengyu Hu, Songqiao Meng, Zihan He, Xiao Zhou, Donghua
contents	With the rapid development of industrial intelligence and unmanned inspection, reliable perception and safety assessment for AI systems in complex and dynamic industrial sites have become a key bottleneck for deploying predictive maintenance and autonomous inspection. Most public datasets remain limited by simulated data sources, single-modality sensing, or the absence of fine-grained object-level annotations, which prevents robust scene understanding and multimodal safety reasoning for industrial foundation models. To address these limitations, InspecSafe-V1 is released as the first multimodal benchmark dataset for industrial inspection safety assessment that is collected from routine operations of real inspection robots in real-world environments. InspecSafe-V1 covers five representative industrial scenarios: tunnels, power facilities, sintering equipment, oil and gas petrochemical plants, and coal conveyor trestles. The dataset is constructed from 41 wheeled and rail-mounted inspection robots operating at 2,239 valid inspection sites, yielding 5,013 inspection instances. For each instance, pixel-level segmentation annotations are provided for key objects in visible-spectrum images. In addition, a semantic scene description and a corresponding safety level label are provided according to practical inspection tasks. Seven synchronized sensing modalities are also included, namely infrared video, audio, depth point clouds, radar point clouds, gas measurements, temperature, and humidity, to support multimodal anomaly recognition, cross-modal fusion, and comprehensive safety assessment in industrial environments.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_21173
institution	arXiv
publishDate	2026
record_format	arxiv
title	InspecSafe-V1: A Multimodal Benchmark for Safety Assessment in Industrial Inspection Scenarios
topic	Robotics Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2601.21173

Similar Items