Staff View: SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Arora, Samir; Wang, Liangliang
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.00201

_version_	1866909186046033920
author	Arora, Samir; Wang, Liangliang
author_facet	Arora, Samir; Wang, Liangliang
contents	Full fine-tuning is a popular approach to adapting Transformer-based pre-trained large language models to a specific downstream task. However, its substantial computational and storage requirements have discouraged widespread use. Moreover, increasing evidence of catastrophic forgetting and overparameterization in the Transformer architecture has motivated researchers to seek parameter-efficient fine-tuning (PEFT) methods. Commonly known PEFT methods such as LoRA and BitFit are typically applied across all layers of the model. We propose a PEFT method, called Stratified Progressive Adaptation Fine-tuning (SPAFIT), based on the localization of different types of linguistic knowledge to specific layers of the model. Our experiments, conducted on nine tasks from the GLUE benchmark, show that SPAFIT outperforms other PEFT methods while fine-tuning only a fraction of the parameters adjusted by those methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_00201
institution	arXiv
publishDate	2024
record_format	arxiv
title	SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2405.00201
