Staff View: Computer Vision Model Compression Techniques for Embedded Systems: A Survey :: Library Catalog

Bibliographic Details
Main Authors:	Lopes, Alexandre; Santos, Fernando Pereira dos; de Oliveira, Diulhio; Schiezaro, Mauricio; Pedrini, Helio
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.08250

_version_	1866913468755476480
author	Lopes, Alexandre; Santos, Fernando Pereira dos; de Oliveira, Diulhio; Schiezaro, Mauricio; Pedrini, Helio
author_facet	Lopes, Alexandre; Santos, Fernando Pereira dos; de Oliveira, Diulhio; Schiezaro, Mauricio; Pedrini, Helio
contents	Deep neural networks have consistently represented the state of the art in most computer vision problems. In these scenarios, larger and more complex models have demonstrated superior performance to smaller architectures, especially when trained with plenty of representative data. With the recent adoption of Vision Transformer (ViT) based architectures and advanced Convolutional Neural Networks (CNNs), the total number of parameters of leading backbone architectures increased from 62M parameters in 2012 with AlexNet to 7B parameters in 2024 with AIM-7B. Consequently, deploying such deep architectures faces challenges in environments with processing and runtime constraints, particularly in embedded systems. This paper covers the main model compression techniques applied for computer vision tasks, enabling modern models to be used in embedded systems. We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique and expected variations when analyzing it on various embedded devices. We also share code to assist researchers and new practitioners in overcoming initial implementation challenges for each subarea and present trends for Model Compression. Case studies for compression models are available at https://github.com/venturusbr/cv-model-compression.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_08250
institution	arXiv
publishDate	2024
record_format	arxiv
title	Computer Vision Model Compression Techniques for Embedded Systems: A Survey
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2408.08250

Similar Items