Bibliographic Details
Main Authors: Hartford, Eric; Atkins, Lucas; Neto, Fernando Fernandes; Golchinfar, David
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2406.06623
Abstract:
  • Efficiently post-training large language models remains a challenging task due to the vast computational resources required. We present Spectrum, a method that accelerates LLM training by selectively targeting layer modules based on their signal-to-noise ratio (SNR) and freezing the remaining modules. Our approach, which uses an algorithm to compute module SNRs prior to training, has been shown to match the performance of full fine-tuning while reducing GPU memory usage. Experiments comparing Spectrum to existing methods such as QLoRA demonstrate its effectiveness in terms of model quality and VRAM efficiency in distributed environments.
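
The selection step described above (rank modules by SNR, train the top ones, freeze the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how SNR is computed, so the scores below are hypothetical precomputed inputs, and the function names, module names, and the `top_fraction` parameter are assumptions made for illustration only.

```python
from typing import Dict, Set


def select_trainable(snr_by_module: Dict[str, float], top_fraction: float) -> Set[str]:
    """Return the names of the highest-SNR modules to keep trainable.

    `snr_by_module` maps a module name to an SNR score computed before
    training (how that score is obtained is outside this sketch).
    `top_fraction` is a hypothetical knob: the share of modules to train.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank module names from highest to lowest SNR.
    ranked = sorted(snr_by_module, key=lambda name: snr_by_module[name], reverse=True)
    # Always keep at least one module trainable.
    keep = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:keep])


def freeze_plan(snr_by_module: Dict[str, float], top_fraction: float) -> Dict[str, bool]:
    """Map each module name to True (train) or False (freeze)."""
    trainable = select_trainable(snr_by_module, top_fraction)
    return {name: name in trainable for name in snr_by_module}


# Hypothetical SNR scores for four illustrative module names.
example_snr = {
    "layers.0.mlp.up_proj": 0.8,
    "layers.1.mlp.up_proj": 2.4,
    "layers.2.mlp.up_proj": 1.1,
    "layers.3.mlp.up_proj": 3.0,
}
plan = freeze_plan(example_snr, top_fraction=0.5)
for name, train in plan.items():
    print(f"{name}: {'train' if train else 'freeze'}")
```

In a PyTorch setting, a plan like this could be applied by setting `requires_grad` to `False` on the parameters of every module marked as frozen, so that only the selected modules receive gradient updates.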