Bibliographic Details
Main Authors: Lim, Valerie; Ng, Kai Wen; Lim, Kenneth
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2401.12472
author Lim, Valerie
Ng, Kai Wen
Lim, Kenneth
contents Natural Language Processing models like BERT can provide state-of-the-art word embeddings for downstream NLP tasks. However, these models have yet to perform well on Semantic Textual Similarity (STS), and may be too large to deploy in lightweight edge applications. To address these two issues, we seek to apply a suitable contrastive learning method, based on the SimCSE paper, to a model architecture adapted from DistilBERT, a knowledge-distillation-based model. Our final lightweight model, DistilFace, achieves an average Spearman's correlation of 72.1 on STS tasks, a 34.2 percent improvement over BERT base.
format Preprint
id arxiv_https___arxiv_org_abs_2401_12472
institution arXiv
publishDate 2024
record_format arxiv
title Contrastive Learning in Distilled Models
topic Computation and Language
url https://arxiv.org/abs/2401.12472