Table of Contents: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Hui, Zheng, Guo, Zhaoxiao, Zhao, Hang, Duan, Juanyong, Huang, Congrui
Format:	Preprint
Published:	2024
Subjects:	Computation and Language Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.14740
Tags:	Add Tag No Tags, Be the first to tag this record!

Table of Contents:

In different NLP tasks, detecting harmful content is crucial for online environments, especially with the growing influence of social media. However, previous research has two main issues: 1) a lack of data in low-resource settings, and 2) inconsistent definitions and criteria for judging harmful content, requiring classification models to be robust to spurious features and diverse. We propose Toxicraft, a novel framework for synthesizing datasets of harmful information to address these weaknesses. With only a small amount of seed data, our framework can generate a wide variety of synthetic, yet remarkably realistic, examples of toxic information. Experimentation across various datasets showcases a notable enhancement in detection model robustness and adaptability, surpassing or close to the gold labels.

Similar Items