Staff View: Linking In-context Learning in Transformers to Human Episodic Memory :: Library Catalog

Bibliographic Details
Main Authors:	Ji-An, Li; Zhou, Corey Y.; Benna, Marcus K.; Mattar, Marcelo G.
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2405.14992

_version_	1866910678674046976
author	Ji-An, Li; Zhou, Corey Y.; Benna, Marcus K.; Mattar, Marcelo G.
author_facet	Ji-An, Li; Zhou, Corey Y.; Benna, Marcus K.; Mattar, Marcelo G.
contents	Understanding connections between artificial and biological intelligent systems can reveal fundamental principles of general intelligence. While many artificial intelligence models have a neuroscience counterpart, such connections are largely missing in Transformer models and the self-attention mechanism. Here, we examine the relationship between interacting attention heads and human episodic memory. We focus on induction heads, which contribute to in-context learning in Transformer-based large language models (LLMs). We demonstrate that induction heads are behaviorally, functionally, and mechanistically similar to the contextual maintenance and retrieval (CMR) model of human episodic memory. Our analyses of LLMs pre-trained on extensive text data show that CMR-like heads often emerge in the intermediate and late layers, qualitatively mirroring human memory biases. The ablation of CMR-like heads suggests their causal role in in-context learning. Our findings uncover a parallel between the computational mechanisms of LLMs and human memory, offering valuable insights into both research fields.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_14992
institution	arXiv
publishDate	2024
record_format	arxiv
title	Linking In-context Learning in Transformers to Human Episodic Memory
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2405.14992

Similar Items