| Main Authors: | Steen, Julius, Markert, Katja |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.08047 |
Similar Items
Whose Facts Win? LLM Source Preferences under Knowledge Conflicts
by: Schuster, Jakob, et al.
Published: (2026)
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)
New Textual Corpora for Serbian Language Modeling
by: Škorić, Mihailo, et al.
Published: (2024)
Comparable Corpora: Opportunities for New Research Directions
by: Church, Kenneth
Published: (2025)
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
by: Xu, Yuemei, et al.
Published: (2024)
On Positional Bias of Faithfulness for Long-form Summarization
by: Wan, David, et al.
Published: (2024)
Pitfalls of Conversational LLMs on News Debiasing
by: Schlicht, Ipek Baris, et al.
Published: (2024)
Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement
by: Kersting, Nicholas S., et al.
Published: (2026)
From Outliers to Topics in Language Models: Anticipating Trends in News Corpora
by: Zve, Evangelia, et al.
Published: (2025)
AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization
by: Gupta, Mukur, et al.
Published: (2025)
The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection
by: Horych, Tomas, et al.
Published: (2024)
Uncertainty Quantification for Evaluating Machine Translation Bias
by: Staliūnaitė, Ieva Raminta, et al.
Published: (2025)
Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora
by: Karch, Tristan, et al.
Published: (2025)
Generalization Bias in Large Language Model Summarization of Scientific Research
by: Peters, Uwe, et al.
Published: (2025)
Reap the Wild Wind: Detecting Media Storms in Large-Scale News Corpora
by: Markus, Dror K., et al.
Published: (2024)
Identifying Emerging Concepts in Large Corpora
by: Ma, Sibo, et al.
Published: (2025)
Validating and Exploring Large Geographic Corpora
by: Dunn, Jonathan
Published: (2024)
The Pitfalls of Defining Hallucination
by: van Deemter, Kees
Published: (2024)
EquiSumm : A Gender Bias-Aware Framework for Inclusive Tweet Summarization
by: Wanjari, Chaitanya, et al.
Published: (2026)
The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education
by: Xu, Paiheng, et al.
Published: (2024)
Mathematical Entities: Corpora and Benchmarks
by: Collard, Jacob, et al.
Published: (2024)
Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization
by: Olabisi, Olubusayo, et al.
Published: (2024)
Pitfalls and Outlooks in Using COMET
by: Zouhar, Vilém, et al.
Published: (2024)
The Growing Gains and Pains of Iterative Web Corpora Crawling: Insights from South Slavic CLASSLA-web 2.0 Corpora
by: Pungeršek, Taja Kuzman, et al.
Published: (2026)
Active Learning for Multilingual Fingerspelling Corpora
by: Wang, Shuai, et al.
Published: (2023)
AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts
by: Milička, Jiří, et al.
Published: (2025)
Building and Aligning Comparable Corpora
by: Saad, Motaz, et al.
Published: (2025)
Decoding News Bias: Multi Bias Detection in News Articles
by: Shah, Bhushan Santosh, et al.
Published: (2025)
Guylingo: The Republic of Guyana Creole Corpora
by: Clarke, Christopher, et al.
Published: (2024)
Unsupervised Location Mapping for Narrative Corpora
by: Wagner, Eitan, et al.
Published: (2025)
Trustworthy Social Bias Measurement
by: Bommasani, Rishi, et al.
Published: (2022)
Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance
by: Martin, Fermin Moscoso del Prado
Published: (2024)
Headline-Guided Extractive Summarization for Thai News Articles
by: Kositcharoensuk, Pimpitchaya, et al.
Published: (2024)
Wiki Dumps to Training Corpora: South Slavic Case
by: Škorić, Mihailo, et al.
Published: (2026)
Disambiguating Numeral Sequences to Decipher Ancient Accounting Corpora
by: Born, Logan, et al.
Published: (2025)
When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR
by: Ko, Dayoon, et al.
Published: (2025)
Attacks against Abstractive Text Summarization Models through Lead Bias and Influence Functions
by: Thota, Poojitha, et al.
Published: (2024)
Unraveling the Capabilities of Language Models in News Summarization
by: Odabaşı, Abdurrahman, et al.
Published: (2025)
CNsum:Automatic Summarization for Chinese News Text
by: Zhao, Yu, et al.
Published: (2025)
DiscoSum: Discourse-aware News Summarization
by: Spangher, Alexander, et al.
Published: (2025)