| Main Authors: | Duan, Yifei; Shang, Raphael; Liang, Deng; Cai, Yongqiang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computation and Language; Machine Learning |
| Online Access: | https://arxiv.org/abs/2502.20726 |
| _version_ | 1866908288048693248 |
|---|---|
| author | Duan, Yifei; Shang, Raphael; Liang, Deng; Cai, Yongqiang |
| author_facet | Duan, Yifei; Shang, Raphael; Liang, Deng; Cai, Yongqiang |
| contents | Language models can be viewed as functions that embed text into Euclidean space, where the quality of the embedding vectors directly determines model performance. Training such neural networks involves various uncertainties. This paper focuses on improving the performance of pre-trained language models in zero-shot settings through a simple and easily implementable method. We propose a novel backward attention mechanism to enhance contextual information encoding. Evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB), our approach achieves significant improvements across multiple tasks, providing valuable insights for advancing zero-shot learning capabilities. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2502_20726 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition |
| topic | Computation and Language; Machine Learning |
| url | https://arxiv.org/abs/2502.20726 |