MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	Mao, Haitao; Liu, Guangliang; Ma, Yao; Wang, Rongrong; Johnson, Kristen; Tang, Jiliang
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2402.02212

_version_	1866915120613949440
author	Mao, Haitao; Liu, Guangliang; Ma, Yao; Wang, Rongrong; Johnson, Kristen; Tang, Jiliang
author_facet	Mao, Haitao; Liu, Guangliang; Ma, Yao; Wang, Rongrong; Johnson, Kristen; Tang, Jiliang
contents	In-Context Learning (ICL) empowers Large Language Models (LLMs) with the ability to learn from a few examples provided in the prompt, enabling downstream generalization without requiring gradient updates. Despite encouraging empirical success, the underlying mechanism of ICL remains unclear. Existing research offers various and sometimes ambiguous viewpoints, relying on intuition-driven and ad hoc technical solutions to interpret ICL. In this paper, we leverage a data generation perspective to reinterpret recent efforts from a systematic angle, demonstrating the potentially broader use of these popular technical solutions. For a conceptual definition, we rigorously adopt the terms skill recognition and skill learning. Skill recognition selects one learned data generation function previously seen during pre-training, while skill learning can learn new data generation functions from in-context data. Furthermore, we provide insights into the strengths and weaknesses of both abilities, emphasizing their commonalities through the perspective of data generation. This analysis suggests potential directions for future research.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_02212
institution	arXiv
publishDate	2024
record_format	arxiv
title	A Survey to Recent Progress Towards Understanding In-Context Learning
topic	Computation and Language
url	https://arxiv.org/abs/2402.02212
