Staff View: The more polypersonal the better -- a short look on space geometry of fine-tuned layers :: Library Catalog

Bibliographic Details
Main Authors:	Kudriashov, Sergei; Zykova, Veronika; Stepanova, Angelina; Raskind, Yakov; Klyshinsky, Eduard
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2501.05503

_version_	1866913643072847872
author	Kudriashov, Sergei; Zykova, Veronika; Stepanova, Angelina; Raskind, Yakov; Klyshinsky, Eduard
author_facet	Kudriashov, Sergei; Zykova, Veronika; Stepanova, Angelina; Raskind, Yakov; Klyshinsky, Eduard
contents	The interpretation of deep learning models is a rapidly growing field, with particular interest in language models. There are various approaches to this task, including training simpler models to replicate neural network predictions and analyzing the latent space of the model. The latter method allows us to not only identify patterns in the model's decision-making process, but also understand the features of its internal structure. In this paper, we analyze the changes in the internal representation of the BERT model when it is trained with additional grammatical modules and data containing new grammatical structures (polypersonality). We find that adding a single grammatical layer causes the model to separate the new and old grammatical systems within itself, improving the overall performance on perplexity metrics.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_05503
institution	arXiv
publishDate	2025
record_format	arxiv
title	The more polypersonal the better -- a short look on space geometry of fine-tuned layers
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2501.05503

Similar Items