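Staff View: Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection :: Library Catalog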

Bibliographic Details
Main Authors:	Kim, Jaehoon; Jin, Seungwan; Park, Sohyun; Park, Someen; Han, Kyungsik
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.07886

_version_	1866929382733381632
author	Kim, Jaehoon; Jin, Seungwan; Park, Sohyun; Park, Someen; Han, Kyungsik
author_facet	Kim, Jaehoon; Jin, Seungwan; Park, Sohyun; Park, Someen; Han, Kyungsik
contents	Detecting implicit hate speech that is not directly hateful remains a challenge. Recent research has attempted to detect implicit hate speech by applying contrastive learning to pre-trained language models such as BERT and RoBERTa, but the proposed models still do not have a significant advantage over cross-entropy loss-based learning. We found that contrastive learning based on randomly sampled batch data does not encourage the model to learn hard negative samples. In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. LAHN outperforms the existing models for implicit hate speech detection both in- and cross-datasets. The code is available at https://github.com/Hanyang-HCC-Lab/LAHN
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_07886
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection Kim, Jaehoon Jin, Seungwan Park, Sohyun Park, Someen Han, Kyungsik Computation and Language Detecting implicit hate speech that is not directly hateful remains a challenge. Recent research has attempted to detect implicit hate speech by applying contrastive learning to pre-trained language models such as BERT and RoBERTa, but the proposed models still do not have a significant advantage over cross-entropy loss-based learning. We found that contrastive learning based on randomly sampled batch data does not encourage the model to learn hard negative samples. In this work, we propose Label-aware Hard Negative sampling strategies (LAHN) that encourage the model to learn detailed features from hard negative samples, instead of naive negative samples in random batch, using momentum-integrated contrastive learning. LAHN outperforms the existing models for implicit hate speech detection both in- and cross-datasets. The code is available at https://github.com/Hanyang-HCC-Lab/LAHN
title	Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection
topic	Computation and Language
url	https://arxiv.org/abs/2406.07886