| Main Authors: | Lim, Valerie; Ng, Kai Wen; Lim, Kenneth |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2401.12472 |
| _version_ | 1866910306018525184 |
|---|---|
| author | Lim, Valerie; Ng, Kai Wen; Lim, Kenneth |
| author_facet | Lim, Valerie; Ng, Kai Wen; Lim, Kenneth |
| contents | Natural Language Processing models like BERT can provide state-of-the-art word embeddings for downstream NLP tasks. However, these models have yet to perform well on Semantic Textual Similarity and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method, based on the SimCSE paper, to a model architecture adapted from a knowledge-distillation-based model, DistilBERT, to address these two issues. Our final lightweight model, DistilFace, achieves an average of 72.1 in Spearman's correlation on STS tasks, a 34.2 percent improvement over BERT base. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2401_12472 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Contrastive Learning in Distilled Models |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2401.12472 |