Bibliographic Details
Main Authors:	Zhou, Jin; Yang, Hanmei; Tang, Steven; Xiang, Mingcan; Guan, Hui; Liu, Tongping
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2410.15651

Table of Contents:

Fine-tuning with Reinforcement Learning from Human Feedback (RLHF) is essential for aligning large language models (LLMs). However, RLHF often encounters significant memory challenges. This study is the first to examine memory usage in the RLHF context, exploring various memory management strategies and unveiling the reasons behind excessive memory consumption. Additionally, we introduce a simple yet effective approach that substantially reduces the memory required for RLHF fine-tuning.

Similar Items