:: Library Catalog

Bibliographic Details
Main Authors:	Smywiński-Pohl, Aleksander; Libal, Tomer; Kaczmarczyk, Adam; Król, Magdalena
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2506.13965

Similar Items

LLM-as-a-Judge is Bad, Based on AI Attempting the Exam Qualifying for the Member of the Polish National Board of Appeal
by: Karp, Michał, et al.
Published: (2025)

Model-Aware Tokenizer Transfer
by: Haltiuk, Mykola, et al.
Published: (2025)

Targum -- A Multilingual New Testament Translation Corpus
by: Rapacz, Maciej, et al.
Published: (2026)

eFontes. Part of Speech Tagging and Lemmatization of Medieval Latin Texts. A Cross-Genre Survey
by: Nowak, Krzysztof, et al.
Published: (2024)

Cognitive models can reveal interpretable value trade-offs in language models
by: Murthy, Sonia K., et al.
Published: (2025)

LLM_annotate: A Python package for annotating and analyzing fiction characters
by: Rosenbusch, Hannes
Published: (2025)

The Illusion-Illusion: Vision Language Models See Illusions Where There are None
by: Ullman, Tomer
Published: (2024)

LLMs for automatic annotation of Mandarin narrative transcripts
by: Zhao, Qingwen, et al.
Published: (2026)

Recovering document annotations for sentence-level bitext
by: Wicks, Rachel, et al.
Published: (2024)

When retrieval outperforms generation: Dense evidence retrieval for scalable fake news detection
by: Qazi, Alamgir Munir, et al.
Published: (2025)

Towards a Principled Evaluation of Knowledge Editors
by: Pohl, Sebastian, et al.
Published: (2025)

Are complicated loss functions necessary for teaching LLMs to reason?
by: Carrino, Gabriele, et al.
Published: (2026)

Annotation alignment: Comparing LLM and human annotations of conversational safety
by: Movva, Rajiv, et al.
Published: (2024)

Coconstructions in spoken data: UD annotation guidelines and first results
by: Pannitto, Ludovica, et al.
Published: (2026)

A framework for annotating and modelling intentions behind metaphor use
by: Michelli, Gianluca, et al.
Published: (2024)

Are generative AI text annotations systematically biased?
by: Stolwijk, Sjoerd B., et al.
Published: (2025)

Can sparse autoencoders be used to decompose and interpret steering vectors?
by: Mayne, Harry, et al.
Published: (2024)

Shades of Zero: Distinguishing Impossibility from Inconceivability
by: Hu, Jennifer, et al.
Published: (2025)

How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu
by: Akera, Benjamin, et al.
Published: (2025)

LM-PUB-QUIZ: A Comprehensive Framework for Zero-Shot Evaluation of Relational Knowledge in Language Models
by: Ploner, Max, et al.
Published: (2024)

PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization
by: Tao, Meiling, et al.
Published: (2025)

A multitask learning framework for leveraging subjectivity of annotators to identify misogyny
by: Angel, Jason, et al.
Published: (2024)

Exploring transfer learning for Deep NLP systems on rarely annotated languages
by: Yadav, Dipendra, et al.
Published: (2024)

A review of annotation classification tools in the educational domain
by: Gayoso-Cabada, Joaquín, et al.
Published: (2025)

Scalable multilingual PII annotation for responsible AI in LLMs
by: Meena, Bharti, et al.
Published: (2025)

Large language models struggle with ethnographic text annotation
by: Goodall, Leonardo S., et al.
Published: (2026)

One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
by: Murthy, Sonia K., et al.
Published: (2024)

ELITE: Embedding-Less retrieval with Iterative Text Exploration
by: Wang, Zhangyu, et al.
Published: (2025)

Transformer verbatim in-context retrieval across time and scale
by: Armeni, Kristijan, et al.
Published: (2024)

Inroads to a Structured Data Natural Language Bijection and the role of LLM annotation
by: Vente, Blake
Published: (2024)

The Coverage Illusion: From Pre-retrieval Routing Failure to Post-retrieval Cascades in a Production RAG System
by: Hussain, Zafar, et al.
Published: (2026)

EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection
by: Schuhmann, Christoph, et al.
Published: (2025)

ParaRev: Building a dataset for Scientific Paragraph Revision annotated with revision instruction
by: Jourdan, Léane, et al.
Published: (2025)

Constraining constructions with WordNet: pros and cons for the semantic annotation of fillers in the Italian Constructicon
by: Pisciotta, Flavio, et al.
Published: (2025)

Understanding the effects of word-level linguistic annotations in under-resourced neural machine translation
by: Sánchez-Cartagena, Víctor M., et al.
Published: (2024)

"You are an expert annotator": Automatic Best-Worst-Scaling Annotations for Emotion Intensity Modeling
by: Bagdon, Christopher, et al.
Published: (2024)

Large corpora and large language models: a replicable method for automating grammatical annotation
by: Morin, Cameron, et al.
Published: (2024)

Counting on Consensus: Selecting the Right Inter-annotator Agreement Metric for NLP Annotation and Evaluation
by: James, Joseph
Published: (2026)

Parser agreement and disagreement in L2 Korean UD: Implications for human-in-the-loop annotation
by: Sung, Hakyung, et al.
Published: (2026)

Researchers waste 80% of LLM annotation costs by classifying one text at a time
by: Pipal, Christian, et al.
Published: (2026)