Staff View: Conditional Language Learning with Context :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Xiao; Li, Miao; Wu, Ji
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.01976

_version_	1866909215961907200
author	Zhang, Xiao; Li, Miao; Wu, Ji
author_facet	Zhang, Xiao; Li, Miao; Wu, Ji
contents	Language models can learn sophisticated language understanding skills from fitting raw text. They also unselectively learn useless corpus statistics and biases, especially during finetuning on domain-specific corpora. In this paper, we propose a simple modification to causal language modeling called conditional finetuning, which performs language modeling conditioned on a context. We show that a context can "explain away" certain corpus statistics and make the model avoid learning them. In this fashion, conditional finetuning achieves selective learning from a corpus, learning knowledge useful for downstream tasks while avoiding learning useless corpus statistics like topic biases. This selective learning effect leads to less forgetting and better stability-plasticity tradeoff in domain finetuning, potentially benefitting lifelong learning with language models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_01976
institution	arXiv
publishDate	2024
record_format	arxiv
title	Conditional Language Learning with Context
topic	Computation and Language
url	https://arxiv.org/abs/2406.01976

Similar Items