Staff View: An Efficient Sparse Fine-Tuning with Low Quantization Error via Neural Network Pruning :: Library Catalog

Bibliographic Details
Main Authors:	Li, Cen-Jhih; Bhaskara, Aditya
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2502.11439

_version_	1866909704194621440
author	Li, Cen-Jhih; Bhaskara, Aditya
author_facet	Li, Cen-Jhih; Bhaskara, Aditya
contents	Fine-tuning is an important step in adapting foundation models such as large language models to downstream tasks. To make this step more accessible to users with limited computational budgets, it is crucial to develop fine-tuning methods that are memory and computationally efficient. Sparse Fine-tuning (SpFT) and Low-rank adaptation (LoRA) are two frameworks that have emerged for addressing this problem and have been adopted widely in practice. In this work, we develop a new SpFT framework, based on ideas from neural network pruning. At a high level, we first identify "important" neurons/nodes using feature importance metrics from network pruning (specifically, we use the structural pruning method), and then perform fine-tuning by restricting to weights involving these neurons. Experiments on common language tasks show our method improves SpFT's memory efficiency by 20-50% while matching the accuracy of state-of-the-art methods like LoRA's variants.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_11439
institution	arXiv
publishDate	2025
record_format	arxiv
title	An Efficient Sparse Fine-Tuning with Low Quantization Error via Neural Network Pruning
topic	Computation and Language; Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2502.11439

Similar Items