Staff View: AlpaGasus: Training A Better Alpaca with Fewer Data :: Library Catalog

Bibliographic Details
Main Authors:	Chen, Lichang; Li, Shiyang; Yan, Jun; Wang, Hai; Gunaratna, Kalpa; Yadav, Vikas; Tang, Zheng; Srinivasan, Vijay; Zhou, Tianyi; Huang, Heng; Jin, Hongxia
Format:	Preprint
Published:	2023
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2307.08701

_version_	1866909104317923328
author	Chen, Lichang; Li, Shiyang; Yan, Jun; Wang, Hai; Gunaratna, Kalpa; Yadav, Vikas; Tang, Zheng; Srinivasan, Vijay; Zhou, Tianyi; Huang, Heng; Jin, Hongxia
author_facet	Chen, Lichang; Li, Shiyang; Yan, Jun; Wang, Hai; Gunaratna, Kalpa; Yadav, Vikas; Tang, Zheng; Srinivasan, Vijay; Zhou, Tianyi; Huang, Heng; Jin, Hongxia
contents	Large language models (LLMs) strengthen instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., Alpaca's 52k data) surprisingly contain many low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and filters out low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce AlpaGasus, which is finetuned on only 9k high-quality data filtered from the 52k Alpaca data. AlpaGasus significantly outperforms the original Alpaca as evaluated by GPT-4 on multiple test sets and the controlled human evaluation. Its 13B variant matches >90% performance of its teacher LLM (i.e., Text-Davinci-003 generating the 52k data) on test tasks. It also provides 5.7x faster training, reducing the training time for a 7B variant from 80 minutes (for Alpaca) to 14 minutes. Moreover, the experiments prove the efficacy of our method across diverse datasets, base models, and LLM filters. Overall, AlpaGasus demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models. Our project page is available at: https://lichang-chen.github.io/AlpaGasus/
format	Preprint
id	arxiv_https___arxiv_org_abs_2307_08701
institution	arXiv
publishDate	2023
record_format	arxiv
title	AlpaGasus: Training A Better Alpaca with Fewer Data
topic	Computation and Language
url	https://arxiv.org/abs/2307.08701
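
The data selection idea summarized in the contents field (score each instruction/response pair with a strong LLM, then keep only the high-scoring ones) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `filter_by_score`, `toy_score`, the 1-to-5 scale, and the 4.5 threshold are all hypothetical names and values chosen for the example, and `toy_score` is a trivial stand-in for a real LLM grader. The last lines only check the training-time arithmetic quoted in the abstract.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]  # assumed keys: "instruction", "output"


def filter_by_score(
    examples: List[Example],
    score_fn: Callable[[Example], float],
    threshold: float,
) -> List[Example]:
    """Keep only the examples whose score is at least `threshold`."""
    return [ex for ex in examples if score_fn(ex) >= threshold]


def toy_score(example: Example) -> float:
    # Hypothetical stand-in for an LLM grader: a real setup would prompt a
    # strong LLM to rate the response. Here an empty response scores 1.0
    # and any non-empty response scores 5.0.
    return 5.0 if example["output"].strip() else 1.0


data = [
    {"instruction": "Name a prime number.", "output": "7"},
    {"instruction": "Translate 'cat' to French.", "output": ""},
    {"instruction": "Add 2 and 3.", "output": "5"},
]

# The 4.5 threshold is an assumption made for this sketch.
kept = filter_by_score(data, toy_score, threshold=4.5)

# Training-time figures quoted in the abstract: 80 minutes vs. 14 minutes.
speedup = 80 / 14
print(len(kept), round(speedup, 1))
```

Passing the scorer in as a function keeps the filtering step independent of whichever LLM or prompt is used to produce the scores.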

Similar Items