Saved in:
Bibliographic Details
Main Authors: Wu, Liang, Ma, X. -G.
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2403.09056
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866911797745811456
author Wu, Liang
Ma, X. -G.
author_facet Wu, Liang
Ma, X. -G.
contents On modern industrial assembly lines, many intelligent algorithms have been developed to replace or supervise workers. However, we found that there were bottlenecks in both training datasets and real-time performance when deploying algorithms on actual assembly line. Therefore, we developed a promising strategy for expanding industrial datasets, which utilized large models with strong generalization abilities to achieve efficient, high-quality, and large-scale dataset expansion, solving the problem of insufficient and low-quality industrial datasets. We also applied this strategy to video action recognition. We proposed a method of converting hand action recognition problems into hand skeletal trajectory classification problems, which solved the real-time performance problem of industrial algorithms. In the "hand movements during wire insertion" scenarios on the actual assembly line, the accuracy of hand action recognition reached 98.8\%. We conducted detailed experimental analysis to demonstrate the effectiveness and superiority of the method, and deployed the entire process on Midea's actual assembly line.
format Preprint
id arxiv_https___arxiv_org_abs_2403_09056
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines
Wu, Liang
Ma, X. -G.
Computer Vision and Pattern Recognition
On modern industrial assembly lines, many intelligent algorithms have been developed to replace or supervise workers. However, we found that there were bottlenecks in both training datasets and real-time performance when deploying algorithms on actual assembly line. Therefore, we developed a promising strategy for expanding industrial datasets, which utilized large models with strong generalization abilities to achieve efficient, high-quality, and large-scale dataset expansion, solving the problem of insufficient and low-quality industrial datasets. We also applied this strategy to video action recognition. We proposed a method of converting hand action recognition problems into hand skeletal trajectory classification problems, which solved the real-time performance problem of industrial algorithms. In the "hand movements during wire insertion" scenarios on the actual assembly line, the accuracy of hand action recognition reached 98.8\%. We conducted detailed experimental analysis to demonstrate the effectiveness and superiority of the method, and deployed the entire process on Midea's actual assembly line.
title Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2403.09056