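Staff View: Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages :: Library Catalog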

Bibliographic Details
Main Authors:	Brinkmann, Jannik; Wendler, Chris; Bartelt, Christian; Mueller, Aaron
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2501.06346

_version_	1866916753595957248
author	Brinkmann, Jannik; Wendler, Chris; Bartelt, Christian; Mueller, Aaron
author_facet	Brinkmann, Jannik; Wendler, Chris; Bartelt, Christian; Mueller, Aaron
contents	Human bilinguals often use similar brain regions to process multiple languages, depending on when they learned their second language and their proficiency. In large language models (LLMs), how are multiple languages learned and encoded? In this work, we explore the extent to which LLMs share representations of morphosyntactic concepts such as grammatical number, gender, and tense across languages. We train sparse autoencoders on Llama-3-8B and Aya-23-8B, and demonstrate that abstract grammatical concepts are often encoded in feature directions shared across many languages. We use causal interventions to verify the multilingual nature of these representations; specifically, we show that ablating only multilingual features decreases classifier performance to near-chance across languages. We then use these features to precisely modify model behavior in a machine translation task; this demonstrates both the generality and selectivity of these features' roles in the network. Our findings suggest that even models trained predominantly on English data can develop robust, cross-lingual abstractions of morphosyntactic concepts.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_06346
institution	arXiv
publishDate	2025
record_format	arxiv
title	Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages
topic	Computation and Language
url	https://arxiv.org/abs/2501.06346

Similar Items