:: Library Catalog

Bibliographic Details
Main Authors:	Steen, Julius; Markert, Katja
Format:	Preprint
Published:	2023
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2309.08047

Similar Items

Whose Facts Win? LLM Source Preferences under Knowledge Conflicts
by: Schuster, Jakob, et al.
Published: (2026)

Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)

New Textual Corpora for Serbian Language Modeling
by: Škorić, Mihailo, et al.
Published: (2024)

Comparable Corpora: Opportunities for New Research Directions
by: Church, Kenneth
Published: (2025)

A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
by: Xu, Yuemei, et al.
Published: (2024)

On Positional Bias of Faithfulness for Long-form Summarization
by: Wan, David, et al.
Published: (2024)

Pitfalls of Conversational LLMs on News Debiasing
by: Schlicht, Ipek Baris, et al.
Published: (2024)

Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement
by: Kersting, Nicholas S., et al.
Published: (2026)

From Outliers to Topics in Language Models: Anticipating Trends in News Corpora
by: Zve, Evangelia, et al.
Published: (2025)

AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization
by: Gupta, Mukur, et al.
Published: (2025)

The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection
by: Horych, Tomas, et al.
Published: (2024)

Uncertainty Quantification for Evaluating Machine Translation Bias
by: Staliūnaitė, Ieva Raminta, et al.
Published: (2025)

Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora
by: Karch, Tristan, et al.
Published: (2025)

Generalization Bias in Large Language Model Summarization of Scientific Research
by: Peters, Uwe, et al.
Published: (2025)

Reap the Wild Wind: Detecting Media Storms in Large-Scale News Corpora
by: Markus, Dror K., et al.
Published: (2024)

Identifying Emerging Concepts in Large Corpora
by: Ma, Sibo, et al.
Published: (2025)

Validating and Exploring Large Geographic Corpora
by: Dunn, Jonathan
Published: (2024)

The Pitfalls of Defining Hallucination
by: van Deemter, Kees
Published: (2024)

EquiSumm: A Gender Bias-Aware Framework for Inclusive Tweet Summarization
by: Wanjari, Chaitanya, et al.
Published: (2026)

The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education
by: Xu, Paiheng, et al.
Published: (2024)

Mathematical Entities: Corpora and Benchmarks
by: Collard, Jacob, et al.
Published: (2024)

Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization
by: Olabisi, Olubusayo, et al.
Published: (2024)

Pitfalls and Outlooks in Using COMET
by: Zouhar, Vilém, et al.
Published: (2024)

The Growing Gains and Pains of Iterative Web Corpora Crawling: Insights from South Slavic CLASSLA-web 2.0 Corpora
by: Pungeršek, Taja Kuzman, et al.
Published: (2026)

Active Learning for Multilingual Fingerspelling Corpora
by: Wang, Shuai, et al.
Published: (2023)

AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts
by: Milička, Jiří, et al.
Published: (2025)

Building and Aligning Comparable Corpora
by: Saad, Motaz, et al.
Published: (2025)

Decoding News Bias: Multi Bias Detection in News Articles
by: Shah, Bhushan Santosh, et al.
Published: (2025)

Guylingo: The Republic of Guyana Creole Corpora
by: Clarke, Christopher, et al.
Published: (2024)

Unsupervised Location Mapping for Narrative Corpora
by: Wagner, Eitan, et al.
Published: (2025)

Trustworthy Social Bias Measurement
by: Bommasani, Rishi, et al.
Published: (2022)

Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance
by: Martin, Fermin Moscoso del Prado
Published: (2024)

Headline-Guided Extractive Summarization for Thai News Articles
by: Kositcharoensuk, Pimpitchaya, et al.
Published: (2024)

Wiki Dumps to Training Corpora: South Slavic Case
by: Škorić, Mihailo, et al.
Published: (2026)

Disambiguating Numeral Sequences to Decipher Ancient Accounting Corpora
by: Born, Logan, et al.
Published: (2025)

When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR
by: Ko, Dayoon, et al.
Published: (2025)

Attacks against Abstractive Text Summarization Models through Lead Bias and Influence Functions
by: Thota, Poojitha, et al.
Published: (2024)

Unraveling the Capabilities of Language Models in News Summarization
by: Odabaşı, Abdurrahman, et al.
Published: (2025)

CNsum: Automatic Summarization for Chinese News Text
by: Zhao, Yu, et al.
Published: (2025)

DiscoSum: Discourse-aware News Summarization
by: Spangher, Alexander, et al.
Published: (2025)