Staff View: SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models :: Library Catalog

Bibliographic Details
Main Authors:	Li, Yuqi; Lu, Yao; Dong, Junhao; Dong, Zeyu; Yang, Chuanguang; Yin, Xin; Chen, Yihao; Gou, Jianping; Tian, Yingli; Huang, Tingwen
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2410.14720

_version_	1866915616963690496
author	Li, Yuqi; Lu, Yao; Dong, Junhao; Dong, Zeyu; Yang, Chuanguang; Yin, Xin; Chen, Yihao; Gou, Jianping; Tian, Yingli; Huang, Tingwen
author_facet	Li, Yuqi; Lu, Yao; Dong, Junhao; Dong, Zeyu; Yang, Chuanguang; Yin, Xin; Chen, Yihao; Gou, Jianping; Tian, Yingli; Huang, Tingwen
contents	Layer pruning has emerged as a potent approach for removing redundant layers from a pre-trained network in order to reduce network size and improve computational efficiency. However, existing layer pruning methods mostly overlook the intrinsic connections and interdependencies between different layers within complicated deep neural networks. This oversight can result in pruned models that do not preserve the essential characteristics of the pre-trained network as effectively as desired. To address these limitations, we propose Similarity-Guided Layer Partition (SGLP) Pruning, a novel pruning framework that exploits representation similarity to guide efficient and informed layer removal for compressing large deep models. Our method begins by employing Centered Kernel Alignment (CKA) to quantify representational similarity between layers, uncovering structural patterns within the network. We then apply Fisher Optimal Segmentation to the similarity matrix to partition the network into semantically coherent layer segments. This segmentation allows pruning decisions to respect layer interdependencies and preserve essential knowledge. Within each segment, we introduce a fine-tuning-free importance evaluation using GradNorm, identifying and removing redundant layers in a targeted, segment-wise manner. Experimental results on both image classification tasks and large language models (LLMs) demonstrate that our proposed SGLP outperforms state-of-the-art methods in both accuracy and efficiency. Our approach achieves significant model compression with minimal performance degradation, making it well suited for deployment in resource-limited environments.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_14720
institution	arXiv
publishDate	2024
record_format	arxiv
title	SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models
topic	Machine Learning; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2410.14720
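
The first two steps named in the abstract (a CKA similarity score between layers, then Fisher Optimal Segmentation of the layers) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the linear form of CKA, treats each row of the similarity matrix as that layer's feature vector, and uses the within-segment sum of squared deviations as the segment cost; the record does not state which variants the paper uses. The function names `linear_cka` and `fisher_segments` are hypothetical.

```python
import math

def center_columns(m):
    """Subtract each column's mean (rows are examples, columns are features)."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]

def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b, summed over all column pairs."""
    return sum(
        sum(p * q for p, q in zip(ca, cb)) ** 2
        for ca in zip(*a)
        for cb in zip(*b)
    )

def linear_cka(x, y):
    """Linear CKA between two activation matrices with the same number of rows."""
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return cross_frobenius_sq(xc, yc) / denom

def fisher_segments(rows, k):
    """Split an ordered list of vectors into k contiguous segments that
    minimise the total within-segment sum of squared deviations
    (assumed cost; the paper's exact criterion may differ)."""
    n = len(rows)

    def diameter(i, j):
        # Cost of the segment rows[i:j]: squared distance to the segment mean.
        seg = rows[i:j]
        means = [sum(col) / len(seg) for col in zip(*seg)]
        return sum((v - mu) ** 2 for r in seg for v, mu in zip(r, means))

    inf = float("inf")
    # best[c][j]: minimal cost of splitting rows[:j] into c segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = best[c - 1][i] + diameter(i, j)
                if cost < best[c][j]:
                    best[c][j], cut[c][j] = cost, i

    # Walk the stored cut points backwards to recover segment boundaries.
    bounds, j = [], n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]
```

In use, one would fill a layer-by-layer matrix with `linear_cka` scores computed from activations on a shared batch, then pass its rows to `fisher_segments` with the desired number of segments; each returned `(start, end)` pair is a half-open range of layer indices. The GradNorm importance step from the abstract is not sketched here because it depends on the model and loss.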

Similar Items