:: Library Catalog

Bibliographic Details
Main Authors:	Pipal, Christian; Vogel, Eva-Maria; Wack, Morgan; Esser, Frank
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2604.03684

Similar Items

Are generative AI text annotations systematically biased?
by: Stolwijk, Sjoerd B., et al.
Published: (2025)

Is text normalization relevant for classifying medieval charters?
by: Atzenhofer-Baumgartner, Florian, et al.
Published: (2024)

Large language models struggle with ethnographic text annotation
by: Goodall, Leonardo S., et al.
Published: (2026)

Specialized text classification: an approach to classifying Open Banking transactions
by: TA, Duc Tuyen, et al.
Published: (2025)

LLM_annotate: A Python package for annotating and analyzing fiction characters
by: Rosenbusch, Hannes
Published: (2025)

Tgea: An error-annotated dataset and benchmark tasks for text generation from pretrained language models
by: He, Jie, et al.
Published: (2025)

Natural language guidance of high-fidelity text-to-speech with synthetic annotations
by: Lyth, Dan, et al.
Published: (2024)

Application of CARE-SD text classifier tools to assess distribution of stigmatizing and doubt-marking language features in EHR
by: Walker, Drew, et al.
Published: (2025)

Taec: a Manually annotated text dataset for trait and phenotype extraction and entity linking in wheat breeding literature
by: Nédellec, Claire, et al.
Published: (2024)

Annotation alignment: Comparing LLM and human annotations of conversational safety
by: Movva, Rajiv, et al.
Published: (2024)

Inroads to a Structured Data Natural Language Bijection and the role of LLM annotation
by: Vente, Blake
Published: (2024)

ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents
by: Ta, Hoang-Thang, et al.
Published: (2024)

Peacemaker at ATE-IT: Automatic term extraction from Italian text for waste management data using encoder model
by: Bakhtiyarzadeh, Mahdi, et al.
Published: (2026)

Evaluating how LLM annotations represent diverse views on contentious topics
by: Brown, Megan A., et al.
Published: (2025)

LlamBERT: Large-scale low-cost data annotation in NLP
by: Csanády, Bálint, et al.
Published: (2024)

Benchmark of stylistic variation in LLM-generated texts
by: Milička, Jiří, et al.
Published: (2025)

ReZero: Enhancing LLM search ability by trying one-more-time
by: Dao, Alan, et al.
Published: (2025)

Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
by: Cegin, Jan, et al.
Published: (2024)

Synthetically generated text for supervised text analysis
by: Halterman, Andrew
Published: (2023)

AgoraSpeech: A multi-annotated comprehensive dataset of political discourse through the lens of humans and AI
by: Sermpezis, Pavlos, et al.
Published: (2025)

Humans can learn to detect AI-generated texts, or at least learn when they can't
by: Milička, Jiří, et al.
Published: (2025)

Less than one percent of words would be affected by gender-inclusive language in German press texts
by: Müller-Spitzer, Carolin, et al.
Published: (2024)

LLMs for automatic annotation of Mandarin narrative transcripts
by: Zhao, Qingwen, et al.
Published: (2026)

Recovering document annotations for sentence-level bitext
by: Wicks, Rachel, et al.
Published: (2024)

Are manual annotations necessary for statutory interpretations retrieval?
by: Smywiński-Pohl, Aleksander, et al.
Published: (2025)

Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis: The case of apology
by: Yu, Danni, et al.
Published: (2023)

Noise-Aware Named Entity Recognition for Historical VET Documents
by: Esser, Alexander M., et al.
Published: (2026)

CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists
by: Lee, Yukyung, et al.
Published: (2024)

The Laziness of the Crowd: Effort Aversion Among Raters Risks Undermining the Efficacy of X's Community Notes Program
by: Wack, Morgan, et al.
Published: (2026)

Comparing LLM-generated and human-authored news text using formal syntactic theory
by: Zamaraeva, Olga, et al.
Published: (2025)

LLM-based feature generation from text for interpretable machine learning
by: Balek, Vojtěch, et al.
Published: (2024)

negativas: a prototype for searching and classifying sentential negation in speech data
by: de Gois, Túlio Sousa, et al.
Published: (2025)

LLM-as-classifier: Semi-Supervised, Iterative Framework for Hierarchical Text Classification using Large Language Models
by: You, Doohee, et al.
Published: (2025)

Under-resourced studies of under-resourced languages: lemmatization and POS-tagging with LLM annotators for historical Armenian, Georgian, Greek and Syriac
by: Vidal-Gorène, Chahan, et al.
Published: (2026)

Political Fact-Checking Efforts are Constrained by Deficiencies in Coverage, Speed, and Reach
by: Wack, Morgan, et al.
Published: (2024)

Building a Family of Data Augmentation Models for Low-cost LLM Fine-tuning on the Cloud
by: Yue, Yuanhao, et al.
Published: (2024)

Kill two birds with one stone: generalized and robust AI-generated text detection via dynamic perturbations
by: Zhou, Yinghan, et al.
Published: (2025)

Stylometry recognizes human and LLM-generated texts in short samples
by: Przystalski, Karol, et al.
Published: (2025)

Beyond speculation: Measuring the growing presence of LLM-generated texts in multilingual disinformation
by: Macko, Dominik, et al.
Published: (2025)

Sentiment analysis and random forest to classify LLM versus human source applied to Scientific Texts
by: Sanchez-Medina, Javier J.
Published: (2024)