Bibliographic Details
Main authors: Xia, Linhan, Yang, Mingzhan, Yuan, Guohui, Tao, Shengnan, Qiu, Yujing, Yu, Guo, Lei, Kai
Format: Preprint
Published: 2025
Subjects:
Online access: https://arxiv.org/abs/2506.00968
_version_ 1866912408567545856
author Xia, Linhan
Yang, Mingzhan
Yuan, Guohui
Tao, Shengnan
Qiu, Yujing
Yu, Guo
Lei, Kai
contents Mainstream Word Sense Disambiguation (WSD) approaches have employed BERT to extract semantics from both context and definitions of senses to determine the most suitable sense of a target word, achieving notable performance. However, there are two limitations in these approaches. First, previous studies failed to balance the representation of token-level (local) and sequence-level (global) semantics during feature extraction, leading to insufficient semantic representation and a performance bottleneck. Second, these approaches incorporated all possible senses of each target word during the training phase, leading to unnecessary computational costs. To overcome these limitations, this paper introduces a poly-encoder BERT-based model with batch contrastive learning for WSD, named PolyBERT. Compared with previous WSD methods, PolyBERT has two improvements: (1) A poly-encoder with a multi-head attention mechanism is utilized to fuse token-level (local) and sequence-level (global) semantics, rather than focusing on just one. This approach enriches semantic representation by balancing local and global semantics. (2) To avoid redundant training inputs, Batch Contrastive Learning (BCL) is introduced. BCL utilizes the correct senses of other target words in the same batch as negative samples for the current target word, which reduces training inputs and computational cost. The experimental results demonstrate that PolyBERT outperforms baseline WSD methods such as Huang's GlossBERT and Blevins's BEM by 2% in F1-score. In addition, PolyBERT with BCL reduces GPU hours by 37.6% compared with PolyBERT without BCL.
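The batch contrastive idea described in the abstract (the correct senses of the other target words in a batch serve as negatives for the current target word) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss, so a softmax cross-entropy over dot-product scores is assumed here, and the names `dot` and `batch_contrastive_loss` as well as the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def batch_contrastive_loss(context_vecs, gloss_vecs):
    """Mean in-batch cross-entropy (assumed loss form, not from the paper).

    context_vecs[i] stands for the embedding of target word i in its context;
    gloss_vecs[i] stands for the embedding of that word's correct sense definition.
    For row i, gloss i is the positive and every other gloss in the batch is a negative.
    """
    n = len(context_vecs)
    total = 0.0
    for i in range(n):
        scores = [dot(context_vecs[i], g) for g in gloss_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]
    return total / n


# Toy batch of three target words with hand-made 3-dimensional embeddings.
contexts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
shuffled = [aligned[1], aligned[2], aligned[0]]

aligned_loss = batch_contrastive_loss(contexts, aligned)
shuffled_loss = batch_contrastive_loss(contexts, shuffled)
print(aligned_loss < shuffled_loss)
```

With a batch of n target words, each context is scored against only n glosses rather than against every candidate sense of every word, which is the kind of input reduction the abstract attributes to BCL.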
format Preprint
id arxiv_https___arxiv_org_abs_2506_00968
institution arXiv
publishDate 2025
record_format arxiv
title PolyBERT: Fine-Tuned Poly Encoder BERT-Based Model for Word Sense Disambiguation
topic Artificial Intelligence
url https://arxiv.org/abs/2506.00968