Staff View: What does the Knowledge Neuron Thesis Have to do with Knowledge? :: Library Catalog

Bibliographic Details
Main Authors:	Niu, Jingcheng; Liu, Andrew; Zhu, Zining; Penn, Gerald
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2405.02421

_version_	1866913341187817472
author	Niu, Jingcheng; Liu, Andrew; Zhu, Zining; Penn, Gerald
author_facet	Niu, Jingcheng; Liu, Andrew; Zhu, Zining; Penn, Gerald
contents	We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism underlying the ability of large language models to recall facts from a training corpus. This nascent thesis proposes that facts are recalled from the training corpus through the MLP weights in a manner resembling key-value memory, implying in effect that "knowledge" is stored in the network. Furthermore, by modifying the MLP modules, one can control the language model's generation of factual information. The plausibility of the KN thesis has been demonstrated by the success of KN-inspired model editing methods (Dai et al., 2022; Meng et al., 2022). We find that this thesis is, at best, an oversimplification. Not only have we found that we can edit the expression of certain linguistic phenomena using the same model editing methods but, through a more comprehensive evaluation, we have found that the KN thesis does not adequately explain the process of factual expression. While it is possible to argue that the MLP weights store complex patterns that are interpretable both syntactically and semantically, these patterns do not constitute "knowledge." To gain a more comprehensive understanding of the knowledge representation process, we must look beyond the MLP weights and explore recent models' complex layer structures and attention mechanisms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_02421
institution	arXiv
publishDate	2024
record_format	arxiv
title	What does the Knowledge Neuron Thesis Have to do with Knowledge?
topic	Computation and Language
url	https://arxiv.org/abs/2405.02421
