Bibliographic Details
Main Authors: Hui, Zheng, Guo, Zhaoxiao, Zhao, Hang, Duan, Juanyong, Huang, Congrui
Format: Preprint
Published: 2024
Subjects: Computation and Language, Artificial Intelligence
Online Access: https://arxiv.org/abs/2409.14740
_version_ 1866913793901068288
author Hui, Zheng
Guo, Zhaoxiao
Zhao, Hang
Duan, Juanyong
Huang, Congrui
contents Detecting harmful content is a crucial NLP task for online environments, especially given the growing influence of social media. However, previous research has two main issues: 1) a lack of data in low-resource settings, and 2) inconsistent definitions and criteria for judging harmful content, which require classification models to be robust to spurious features and to diverse content. We propose ToxiCraft, a novel framework for synthesizing datasets of harmful information to address these weaknesses. With only a small amount of seed data, our framework can generate a wide variety of synthetic, yet remarkably realistic, examples of toxic information. Experiments across various datasets show a notable improvement in the robustness and adaptability of detection models, with results that surpass or come close to those obtained with gold labels.
format Preprint
id arxiv_https___arxiv_org_abs_2409_14740
institution arXiv
publishDate 2024
record_format arxiv
title ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2409.14740