Bibliographic Details
Main Authors: Wang, Weixuan, Haddow, Barry, Wu, Minghao, Peng, Wei, Birch, Alexandra
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2406.09265
author Wang, Weixuan
Haddow, Barry
Wu, Minghao
Peng, Wei
Birch, Alexandra
contents Large language models (LLMs) have revolutionized the field of natural language processing (NLP), and recent studies have aimed to understand their underlying mechanisms. However, most of this research is conducted in a monolingual setting, primarily focusing on English. Few studies have explored the internal workings of LLMs in multilingual settings. In this study, we aim to fill this research gap by examining how neuron activation is shared across tasks and languages. We classify neurons into four distinct categories based on their responses to a specific input across different languages: all-shared, partial-shared, specific, and non-activated. Building upon this categorisation, we conduct extensive experiments on three tasks across nine languages using several LLMs and present an in-depth analysis. Our findings reveal that: (i) deactivating the all-shared neurons significantly decreases performance; (ii) the shared neurons, especially the all-shared neurons, play a vital role in generating responses; (iii) neuron activation patterns are highly sensitive and vary across tasks, LLMs, and languages. These findings shed light on the internal workings of multilingual LLMs and pave the way for future research. We release the code to foster research in this area.
format Preprint
id arxiv_https___arxiv_org_abs_2406_09265
institution arXiv
publishDate 2024
record_format arxiv
title Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
topic Computation and Language
url https://arxiv.org/abs/2406.09265