| Main Authors: | Klimaszewski, Mateusz; Andruszkiewicz, Piotr; Birch, Alexandra |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2404.15737 |
| _version_ | 1866914942613979136 |
|---|---|
| author | Klimaszewski, Mateusz; Andruszkiewicz, Piotr; Birch, Alexandra |
| author_facet | Klimaszewski, Mateusz; Andruszkiewicz, Piotr; Birch, Alexandra |
| contents | Modular deep learning is the state-of-the-art solution for lifting the curse of multilinguality, preventing the impact of negative interference and enabling cross-lingual performance in Multilingual Pre-trained Language Models. However, a trade-off of this approach is the reduction in positive transfer learning from closely related languages. In response, we introduce a novel method called language arithmetic, which enables training-free post-processing to address this limitation. Extending the task arithmetic framework, we apply learning via addition to the language adapters, transitioning the framework from a multi-task to a multilingual setup. The effectiveness of the proposed solution is demonstrated on three downstream tasks in a MAD-X-based set of cross-lingual schemes, acting as a post-processing procedure. Language arithmetic consistently improves the baselines with significant gains, especially in the most challenging case of zero-shot application. Our code and models are available at https://github.com/mklimasz/language-arithmetic . |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2404_15737 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2404.15737 |
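
The abstract describes language arithmetic as "learning via addition" applied to language adapters, extending the task arithmetic framework. The snippet below is a minimal sketch of that general idea only, not the authors' implementation: it forms a difference vector for each adapter relative to a shared initialisation and adds a weighted sum of those vectors back onto the initialisation. The parameter names, the toy values, the shared-initialisation assumption and the weighting `lam` / `1 - lam` are all assumptions made for illustration; the repository linked in the record is the authoritative source.

```python
from typing import Dict, List

# Hypothetical flat parameter store: parameter name -> list of values.
Params = Dict[str, List[float]]


def task_vector(tuned: Params, init: Params) -> Params:
    """Difference between trained adapter weights and their shared initialisation."""
    return {k: [t - i for t, i in zip(tuned[k], init[k])] for k in init}


def add_vectors(init: Params, vectors: List[Params], weights: List[float]) -> Params:
    """Return init + sum_j weights[j] * vectors[j], computed per parameter."""
    merged = {k: list(v) for k, v in init.items()}
    for vec, w in zip(vectors, weights):
        for k in merged:
            merged[k] = [m + w * d for m, d in zip(merged[k], vec[k])]
    return merged


# Toy values, invented purely for illustration.
init = {"down.weight": [0.0, 0.0], "up.weight": [1.0, 1.0]}
adapter_a = {"down.weight": [2.0, 0.0], "up.weight": [1.0, 3.0]}
adapter_b = {"down.weight": [0.0, 4.0], "up.weight": [3.0, 1.0]}

lam = 0.5  # assumed interpolation weight between the two adapters
combined = add_vectors(
    init,
    [task_vector(adapter_a, init), task_vector(adapter_b, init)],
    [lam, 1.0 - lam],
)
print(combined)
```

No training step is involved: the combined adapter is obtained purely by arithmetic on existing weights, which is the sense in which the abstract calls the method training-free post-processing.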