Bibliographic Details
Main Authors: Tan, Xingwei, Lyu, Chen, Umer, Hafiz Muhammad, Khan, Sahrish, Parvatham, Mahathi, Arthurs, Lois, Cullen, Simon, Wilson, Shelley, Jhumka, Arshad, Pergola, Gabriele
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2503.06534
_version_ 1866909531925118976
author Tan, Xingwei
Lyu, Chen
Umer, Hafiz Muhammad
Khan, Sahrish
Parvatham, Mahathi
Arthurs, Lois
Cullen, Simon
Wilson, Shelley
Jhumka, Arshad
Pergola, Gabriele
contents Detecting toxic language, including sexism, harassment, and abusive behaviour, remains a critical challenge, particularly in its subtle and context-dependent forms. Existing approaches largely focus on isolated message-level classification, overlooking toxicity that emerges across conversational contexts. To promote and enable future research in this direction, we introduce SafeSpeech, a comprehensive platform for toxic content detection and analysis that bridges message-level and conversation-level insights. The platform integrates fine-tuned classifiers and large language models (LLMs) to enable multi-granularity detection, toxic-aware conversation summarization, and persona profiling. SafeSpeech also incorporates explainability mechanisms, such as perplexity gain analysis, to highlight the linguistic elements driving predictions. Evaluations on benchmark datasets, including EDOS, OffensEval, and HatEval, show that the platform reproduces state-of-the-art performance across multiple tasks, including fine-grained sexism detection.
format Preprint
id arxiv_https___arxiv_org_abs_2503_06534
institution arXiv
publishDate 2025
record_format arxiv
title SafeSpeech: A Comprehensive and Interactive Tool for Analysing Sexist and Abusive Language in Conversations
topic Computation and Language
url https://arxiv.org/abs/2503.06534