| Main authors: | Akhtar, Ryyan; Pahwa, Payal; Arora, Monika |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning; Computation and Language |
| Online access: | https://arxiv.org/abs/2604.07098 |
| _version_ | 1866911671464755200 |
|---|---|
| author | Akhtar, Ryyan; Pahwa, Payal; Arora, Monika |
| author_facet | Akhtar, Ryyan; Pahwa, Payal; Arora, Monika |
| contents | Large language models often fail on tasks they seem to already understand. In our experiments, this appears to be less about missing knowledge and more about certain internal circuits not being strongly activated during inference. We explore Selective Neuron Amplification (SNA), which increases the influence of task-relevant neurons without changing the model's parameters. The method works at inference time and does not permanently alter the model. SNA helps mainly when the model is uncertain and has little effect when the model is already confident. This suggests that some model failures are due to weak activation rather than a lack of capability. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_07098 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Selective Neuron Amplification in Transformer Language Models |
| topic | Machine Learning; Computation and Language |
| url | https://arxiv.org/abs/2604.07098 |
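
The abstract in this record describes amplifying task-relevant neurons at inference time while leaving the model's parameters untouched. The record does not say how neurons are selected or how strongly they are scaled, so the following is only a minimal sketch of the general idea on a toy two-layer network, not the authors' implementation. The weights `W_IN` and `W_OUT`, the input `x`, the `amplified` index set, and the `gain` value are all hypothetical and chosen purely for illustration.

```python
import math


def relu(values):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(w_in, w_out, x, amplified=(), gain=1.0):
    """Toy two-layer network.

    Hidden units whose index is in `amplified` have their activation
    multiplied by `gain`. Only activations are scaled; the weight
    matrices are read but never modified.
    """
    hidden = relu(matvec(w_in, x))
    hidden = [h * gain if i in amplified else h for i, h in enumerate(hidden)]
    return softmax(matvec(w_out, hidden))


# Hypothetical toy weights: 2 inputs, 3 hidden units, 2 output classes.
W_IN = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_OUT = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.2]]
x = [0.6, 0.5]  # an input on which the two classes are nearly tied

baseline = forward(W_IN, W_OUT, x)
# Assume hidden unit 0 has been identified as relevant to class 0.
boosted = forward(W_IN, W_OUT, x, amplified={0}, gain=3.0)

print("baseline:", baseline)
print("boosted: ", boosted)
```

In this toy setup the baseline prediction is close to a tie, and scaling one hidden unit shifts probability toward the class that unit feeds. Because the scaling is applied per call, running `forward` again with the default `gain=1.0` reproduces the baseline.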