Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Rahamim, Adir; Saphra, Naomi; Kangaslahti, Sara; Belinkov, Yonatan
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2409.04206
_version_	1866929489889460224
author	Rahamim, Adir; Saphra, Naomi; Kangaslahti, Sara; Belinkov, Yonatan
author_facet	Rahamim, Adir; Saphra, Naomi; Kangaslahti, Sara; Belinkov, Yonatan
contents	Parameter-efficient finetuning methods like low-rank adaptation (LoRA) aim to reduce the computational costs of finetuning pretrained Language Models (LMs). Enabled by these low-rank settings, we propose an even more efficient optimization strategy: Fast Forward, a simple and effective approach to accelerate large segments of training. In a Fast Forward stage, we repeat the most recent optimizer step until the loss stops improving on a tiny validation set. By alternating between regular optimization steps and Fast Forward stages, Fast Forward provides up to an 87% reduction in FLOPs and up to an 81% reduction in train time over standard SGD with Adam. We validate Fast Forward by finetuning various models on different tasks and demonstrate that it speeds up training without compromising model performance. Additionally, we analyze when and how to apply Fast Forward.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_04206
institution	arXiv
publishDate	2024
record_format	arxiv
title	Fast Forwarding Low-Rank Training
topic	Machine Learning; Computation and Language
url	https://arxiv.org/abs/2409.04206
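
The abstract in the contents field describes a Fast Forward stage as repeating the most recent optimizer step until the loss on a tiny validation set stops improving, alternated with regular optimization steps. The following is a minimal sketch of that loop under stated assumptions, not the authors' implementation: it uses plain gradient descent on a toy quadratic instead of Adam with LoRA, the same toy loss stands in for both training and validation loss, and the names `fast_forward`, `train`, `regular_steps`, `rounds`, and `max_repeats` are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def fast_forward(
    weights: Vector,
    last_step: Vector,
    val_loss: Callable[[Vector], float],
    max_repeats: int = 1000,  # hypothetical safety cap, not from the abstract
) -> Tuple[Vector, int]:
    """Re-apply `last_step` while the validation loss keeps improving."""
    best = val_loss(weights)
    repeats = 0
    while repeats < max_repeats:
        candidate = [w + d for w, d in zip(weights, last_step)]
        loss = val_loss(candidate)
        if loss >= best:  # no improvement: discard candidate and stop
            break
        weights, best = candidate, loss
        repeats += 1
    return weights, repeats


def train(
    weights: Vector,
    grad: Callable[[Vector], Vector],
    val_loss: Callable[[Vector], float],
    lr: float = 0.01,
    regular_steps: int = 3,  # assumed schedule, chosen for illustration
    rounds: int = 5,
) -> Vector:
    """Alternate a few regular steps with one Fast Forward stage."""
    for _ in range(rounds):
        step = [0.0] * len(weights)
        for _ in range(regular_steps):
            # Regular step: plain gradient descent (stand-in for Adam).
            step = [-lr * g for g in grad(weights)]
            weights = [w + s for w, s in zip(weights, step)]
        # Fast Forward stage: reuse the most recent step without new gradients.
        weights, _ = fast_forward(weights, step, val_loss)
    return weights


# Toy problem (assumption): minimize the squared distance to a fixed target.
target = [1.0, -2.0]


def toy_loss(w: Vector) -> float:
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def toy_grad(w: Vector) -> Vector:
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


final_weights = train([0.0, 0.0], toy_grad, toy_loss)
```

In this sketch a Fast Forward repeat costs only a validation-loss evaluation, with no gradient computation, which is the source of the savings the abstract reports; the actual savings depend on the model, task, and schedule and are not reproduced by this toy.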
