Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Hui, Zheng; Guo, Zhaoxiao; Zhao, Hang; Duan, Juanyong; Huang, Congrui
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.14740

_version_	1866913793901068288
author	Hui, Zheng; Guo, Zhaoxiao; Zhao, Hang; Duan, Juanyong; Huang, Congrui
author_facet	Hui, Zheng; Guo, Zhaoxiao; Zhao, Hang; Duan, Juanyong; Huang, Congrui
contents	In different NLP tasks, detecting harmful content is crucial for online environments, especially with the growing influence of social media. However, previous research has two main issues: 1) a lack of data in low-resource settings, and 2) inconsistent definitions and criteria for judging harmful content, requiring classification models to be robust to spurious features and diverse. We propose ToxiCraft, a novel framework for synthesizing datasets of harmful information to address these weaknesses. With only a small amount of seed data, our framework can generate a wide variety of synthetic, yet remarkably realistic, examples of toxic information. Experimentation across various datasets showcases a notable enhancement in detection model robustness and adaptability, surpassing or coming close to the gold labels.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_14740
institution	arXiv
publishDate	2024
record_format	arxiv
title	ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2409.14740