:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Jennifer L., Ladhak, Faisal, Li, Daniel, Elhadad, Noémie
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2411.06213

Similar Items

SPEER: Sentence-Level Planning of Long Clinical Summaries via Embedded Entity Retrieval
by: Adams, Griffin, et al.
Published: (2024)

HateXScore: A Metric Suite for Evaluating Reasoning Quality in Hate Speech Explanations
by: Hu, Yujia, et al.
Published: (2026)

An Effective, Robust and Fairness-aware Hate Speech Detection Framework
by: Mou, Guanyi, et al.
Published: (2024)

HateDebias: On the Diversity and Variability of Hate Speech Debiasing
by: Wu, Hongyan, et al.
Published: (2024)

Dual-Class Prompt Generation: Enhancing Indonesian Gender-Based Hate Speech Detection through Data Augmentation
by: Ibrahim, Muhammad Amien, et al.
Published: (2025)

"Is Hate Lost in Translation?": Evaluation of Multilingual LGBTQIA+ Hate Speech Detection
by: Chan, Fai Leui, et al.
Published: (2024)

Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?
by: Wang, Yifan, et al.
Published: (2025)

Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
by: Yadav, Neemesh, et al.
Published: (2024)

Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
by: Calabrese, Agostina, et al.
Published: (2024)

PEACE 2.0: Grounded Explanations and Counter-Speech for Combating Hate Expressions
by: Damo, Greta, et al.
Published: (2026)

SAFE-MEME: Structured Reasoning Framework for Robust Hate Speech Detection in Memes
by: Nandi, Palash, et al.
Published: (2024)

Aligning Large Language Models via Fine-grained Supervision
by: Xu, Dehong, et al.
Published: (2024)

Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models
by: Yuan, Shuzhou, et al.
Published: (2025)

MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles
by: Ganguly, Amrita, et al.
Published: (2024)

Aligning Attention with Human Rationales for Self-Explaining Hate Speech Detection
by: Eilertsen, Brage, et al.
Published: (2025)

Compositional Generalisation for Explainable Hate Speech Detection
by: Calabrese, Agostina, et al.
Published: (2025)

Automatic Textual Normalization for Hate Speech Detection
by: Nguyen, Anh Thi-Hoang, et al.
Published: (2023)

Advancing Hate Speech Detection with Transformers: Insights from the MetaHate
by: Chapagain, Santosh, et al.
Published: (2025)

L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
by: Sun, Simeng, et al.
Published: (2025)

Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection
by: Gajewska, Ewelina, et al.
Published: (2025)

NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
by: Tonneau, Manuel, et al.
Published: (2024)

When Hate Meets Facts: LLMs-in-the-Loop for Check-worthiness Detection in Hate Speech
by: Ocampo, Nicolás Benjamín, et al.
Published: (2026)

MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection
by: Piot, Paloma, et al.
Published: (2024)

Hate Speech Detection with Generalizable Target-aware Fairness
by: Chen, Tong, et al.
Published: (2024)

HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection
by: Proskurina, Irina, et al.
Published: (2025)

Hate Speech According to the Law: An Analysis for Effective Detection
by: Korre, Katerina, et al.
Published: (2024)

Self-Explaining Hate Speech Detection with Moral Rationales
by: Vargas, Francielle, et al.
Published: (2026)

Code-Mixed Telugu-English Hate Speech Detection
by: Kakarla, Santhosh, et al.
Published: (2025)

GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?
by: Jin, Yiping, et al.
Published: (2024)

Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models
by: Bui, Minh Duc, et al.
Published: (2024)

Disagreeing Rationales: Rethinking Classification and Explainability Evaluation in Hate Speech Detection
by: Muscato, Benedetta, et al.
Published: (2026)

HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models
by: Nghiem, Huy, et al.
Published: (2024)

HateTinyLLM : Hate Speech Detection Using Tiny Large Language Models
by: Sen, Tanmay, et al.
Published: (2024)

EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter
by: Ilevbare, Comfort Eseohen, et al.
Published: (2024)

Decoding Hate: Exploring Language Models' Reactions to Hate Speech
by: Piot, Paloma, et al.
Published: (2024)

X-MuTeST: A Multilingual Benchmark for Explainable Hate Speech Detection and A Novel LLM-consulted Explanation Framework
by: Rehman, Mohammad Zia Ur, et al.
Published: (2026)

Towards Fairness Assessment of Dutch Hate Speech Detection
by: Bauer, Julie, et al.
Published: (2025)

Probing Critical Learning Dynamics of PLMs for Hate Speech Detection
by: Masud, Sarah, et al.
Published: (2024)

xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection
by: Girón, Adrián, et al.
Published: (2026)

Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification
by: Wei, Lu, et al.
Published: (2025)