Bibliographic Details
Main Authors: Chen, Feiyang, Luo, Ziqian, Zhou, Lisang, Pan, Xueting, Jiang, Ying
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2404.10407
author Chen, Feiyang
Luo, Ziqian
Zhou, Lisang
Pan, Xueting
Jiang, Ying
contents Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks. However, their practical deployment is hampered by high computational and memory demands. This study addresses the challenge by evaluating four primary model compression techniques: quantization, low-rank approximation, knowledge distillation, and pruning. We methodically analyze and compare the efficacy of these techniques and their combinations in optimizing ViTs for resource-constrained environments. Our comprehensive experimental evaluation demonstrates that these methods facilitate a balanced compromise between model accuracy and computational efficiency, paving the way for wider application in edge computing devices.
format Preprint
id arxiv_https___arxiv_org_abs_2404_10407
institution arXiv
publishDate 2024
record_format arxiv
title Comprehensive Survey of Model Compression and Speed up for Vision Transformers
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2404.10407