Bibliographic Details
Main Authors: Mao, Haitao, Liu, Guangliang, Ma, Yao, Wang, Rongrong, Johnson, Kristen, Tang, Jiliang
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2402.02212
_version_ 1866915120613949440
author Mao, Haitao
Liu, Guangliang
Ma, Yao
Wang, Rongrong
Johnson, Kristen
Tang, Jiliang
author_facet Mao, Haitao
Liu, Guangliang
Ma, Yao
Wang, Rongrong
Johnson, Kristen
Tang, Jiliang
contents In-Context Learning (ICL) empowers Large Language Models (LLMs) with the ability to learn from a few examples provided in the prompt, enabling downstream generalization without the requirement for gradient updates. Despite encouraging empirical success, the underlying mechanism of ICL remains unclear. Existing research remains ambiguous with various viewpoints, utilizing intuition-driven and ad-hoc technical solutions to interpret ICL. In this paper, we leverage a data generation perspective to reinterpret recent efforts from a systematic angle, demonstrating the potential broader usage of these popular technical solutions. For a conceptual definition, we rigorously adopt the terms of skill recognition and skill learning. Skill recognition selects one learned data generation function previously seen during pre-training while skill learning can learn new data generation functions from in-context data. Furthermore, we provide insights into the strengths and weaknesses of both abilities, emphasizing their commonalities through the perspective of data generation. This analysis suggests potential directions for future research.
format Preprint
id arxiv_https___arxiv_org_abs_2402_02212
institution arXiv
publishDate 2024
record_format arxiv
title A Survey to Recent Progress Towards Understanding In-Context Learning
topic Computation and Language
url https://arxiv.org/abs/2402.02212