Bibliographic Details
Main Authors: Xu, Yinuo, Jurgens, David
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2601.09065
_version_ 1866914260023508992
author Xu, Yinuo
Jurgens, David
author_facet Xu, Yinuo
Jurgens, David
contents Annotator disagreement is widespread in NLP, particularly for subjective and ambiguous tasks such as toxicity detection and stance analysis. While early approaches treated disagreement as noise to be removed, recent work increasingly models it as a meaningful signal reflecting variation in interpretation and perspective. This survey provides a unified view of disagreement-aware NLP methods. We first present a domain-agnostic taxonomy of the sources of disagreement spanning data, task, and annotator factors. We then synthesize modeling approaches using a common framework defined by prediction targets and pooling structure, highlighting a shift from consensus learning toward explicitly modeling disagreement, and toward capturing structured relationships among annotators. We review evaluation metrics for both predictive performance and annotator behavior, noting that most fairness evaluations remain descriptive rather than normative. We conclude by identifying open challenges and future directions, including integrating multiple sources of variation, developing disagreement-aware interpretability frameworks, and grappling with the practical tradeoffs of perspectivist modeling.
format Preprint
id arxiv_https___arxiv_org_abs_2601_09065
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Beyond Consensus: Perspectivist Modeling and Evaluation of Annotator Disagreement in NLP
Xu, Yinuo
Jurgens, David
Computation and Language
Annotator disagreement is widespread in NLP, particularly for subjective and ambiguous tasks such as toxicity detection and stance analysis. While early approaches treated disagreement as noise to be removed, recent work increasingly models it as a meaningful signal reflecting variation in interpretation and perspective. This survey provides a unified view of disagreement-aware NLP methods. We first present a domain-agnostic taxonomy of the sources of disagreement spanning data, task, and annotator factors. We then synthesize modeling approaches using a common framework defined by prediction targets and pooling structure, highlighting a shift from consensus learning toward explicitly modeling disagreement, and toward capturing structured relationships among annotators. We review evaluation metrics for both predictive performance and annotator behavior, noting that most fairness evaluations remain descriptive rather than normative. We conclude by identifying open challenges and future directions, including integrating multiple sources of variation, developing disagreement-aware interpretability frameworks, and grappling with the practical tradeoffs of perspectivist modeling.
title Beyond Consensus: Perspectivist Modeling and Evaluation of Annotator Disagreement in NLP
topic Computation and Language
url https://arxiv.org/abs/2601.09065