MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Kavathia, Aarjav; Sayer, Simeon
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.01348

_version_	1866912102831095808
author	Kavathia, Aarjav; Sayer, Simeon
author_facet	Kavathia, Aarjav; Sayer, Simeon
contents	As violent crimes continue to happen, it becomes necessary to have security cameras that can rapidly identify moments of violence with excellent accuracy. The purpose of this study is to identify how many frames should be analyzed at a time in order to optimize a violence detection model's accuracy as a parameter of the depth of a 3D convolutional network. Previous violence classification models have been created, but their application to live footage may be flawed. In this project, a convolutional neural network was created to analyze optical flow frames of each video. The number of frames analyzed at a time would vary with one, two, three, ten, and twenty frames, and each model would be trained for 20 epochs. The greatest validation accuracy was 94.87% and occurred with the model that analyzed three frames at a time. This means that machine learning models to detect violence may function better when analyzing three frames at a time for this dataset. The methodology used to identify the optimal number of frames to analyze at a time could be used in other applications of video classification, especially those of complex or abstract actions, such as violence.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_01348
institution	arXiv
publishDate	2024
record_format	arxiv
title	Optimizing Violence Detection in Video Classification Accuracy through 3D Convolutional Neural Networks
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2411.01348
