Staff View: LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them? :: Library Catalog

Bibliographic Details
Main Authors:	Tamo, J. Ben; Carlander-Reuterfelt, Daniel; Rubin, Jonathan; Hong, Dezhi; Wang, Mingxian; Poliannikov, Oleg
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2601.20009

_version_	1866908904425783296
author	Tamo, J. Ben; Carlander-Reuterfelt, Daniel; Rubin, Jonathan; Hong, Dezhi; Wang, Mingxian; Poliannikov, Oleg
author_facet	Tamo, J. Ben; Carlander-Reuterfelt, Daniel; Rubin, Jonathan; Hong, Dezhi; Wang, Mingxian; Poliannikov, Oleg
contents	Despite multilingual pretraining, large language models often struggle with non-English tasks, particularly in language control, the ability to respond in the intended language. We identify and characterize two key failure modes: the multilingual transfer bottleneck (correct language, incorrect task response) and the language consistency bottleneck (correct task response, wrong language). To systematically surface these issues, we design a four-scenario evaluation protocol spanning MMLU, MGSM, and XQuAD benchmarks. To probe these issues with interpretability, we extend logit lens analysis to track language probabilities layer by layer and compute cross-lingual semantic similarity of hidden states. The results reveal a three-phase internal structure: early layers align inputs into a shared semantic space, middle layers perform task reasoning, and late layers drive language-specific generation. Guided by these insights, we introduce selective fine-tuning of only the final layers responsible for language control. On Qwen-3-32B and Bloom-7.1B, this method achieves over 98 percent language consistency across six languages while fine-tuning only 3-5 percent of parameters, without sacrificing task accuracy. Importantly, this result is nearly identical to that of full-scope fine-tuning (for example, above 98 percent language consistency for both methods across all prompt scenarios) but uses a fraction of the computational resources. To the best of our knowledge, this is the first approach to leverage layer-localization of language control for efficient multilingual adaptation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_20009
institution	arXiv
publishDate	2026
record_format	arxiv
title	LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?
topic	Computation and Language; Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2601.20009
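
The abstract in this record describes fine-tuning only the final layers of a model, which amounts to 3-5 percent of its parameters. A minimal sketch of that bookkeeping follows, assuming a toy model description rather than a real checkpoint. The names `Layer`, `freeze_all_but_last`, and `trainable_fraction` are hypothetical and do not come from the paper, and the layer count and sizes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One transformer block in a toy model description (hypothetical)."""
    index: int
    num_params: int
    trainable: bool = True


def freeze_all_but_last(layers: list[Layer], k: int) -> list[Layer]:
    """Return a copy of `layers` in which only the final `k` blocks are trainable."""
    if not 0 <= k <= len(layers):
        raise ValueError("k must be between 0 and the number of layers")
    cutoff = len(layers) - k
    return [
        Layer(layer.index, layer.num_params, trainable=(pos >= cutoff))
        for pos, layer in enumerate(layers)
    ]


def trainable_fraction(layers: list[Layer], other_params: int = 0) -> float:
    """Fraction of all parameters that remain trainable.

    `other_params` counts frozen parameters outside the blocks
    (for example embeddings); it is an illustrative assumption.
    """
    total = sum(layer.num_params for layer in layers) + other_params
    trainable = sum(layer.num_params for layer in layers if layer.trainable)
    return trainable / total


# Toy configuration: 40 equally sized blocks, not the sizes of any real model.
model = [Layer(index=i, num_params=1_000_000) for i in range(40)]
tuned = freeze_all_but_last(model, k=2)

print([layer.index for layer in tuned if layer.trainable])  # [38, 39]
print(f"{trainable_fraction(tuned):.1%}")                   # 5.0%
```

In a real training setup the same selection would be applied to the framework's parameter objects. Which layers and how many to unfreeze would come from the layer-wise analysis the abstract describes, not from a fixed `k`.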

Similar Items