:: Library Catalog

Bibliographic Details
Main Authors:	Kamp, Jonathan; Bakker, Roos; Blok, Dominique
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.11108

Similar Items

The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement
by: Kamp, Jonathan, et al.
Published: (2024)

Learning from Sufficient Rationales: Analysing the Relationship Between Explanation Faithfulness and Token-level Regularisation Strategies
by: Kamp, Jonathan, et al.
Published: (2025)

Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning
by: Ming, Xiaoyang, et al.
Published: (2026)

Feature Attribution Stability Suite: How Stable Are Post-Hoc Attributions?
by: Subramaniakuppusamy, Kamalasankari, et al.
Published: (2026)

Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers
by: Xie, Roy, et al.
Published: (2024)

Evaluating Evidence Attribution in Generated Fact Checking Explanations
by: Xing, Rui, et al.
Published: (2024)

GraphLSS: Integrating Lexical, Structural, and Semantic Features for Long Document Extractive Summarization
by: Bugueño, Margarita, et al.
Published: (2024)

Self-supervised Attribute-aware Dynamic Preference Ranking Alignment
by: Yang, Hongyu, et al.
Published: (2025)

Who is in the Spotlight: The Hidden Bias Undermining Multimodal Retrieval-Augmented Generation
by: Yao, Jiayu, et al.
Published: (2025)

Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms
by: Li, Mingjie, et al.
Published: (2026)

EvalxNLP: A Framework for Benchmarking Post-Hoc Explainability Methods on NLP Models
by: Dhaini, Mahdi, et al.
Published: (2025)

Mitigation of Gender and Ethnicity Bias in AI-Generated Stories through Model Explanations
by: Dimgba, Martha O., et al.
Published: (2025)

RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns
by: Chen, Xin, et al.
Published: (2025)

Saying the Unsaid: Revealing the Hidden Language of Multimodal Systems Through Telephone Games
by: Zhao, Juntu, et al.
Published: (2025)

Post-edits Are Preferences Too
by: Berger, Nathaniel, et al.
Published: (2024)

Lexicalization Is All You Need: Examining the Impact of Lexical Knowledge in a Compositional QALD System
by: Schmidt, David Maria, et al.
Published: (2024)

Faithfulness Serum: Mitigating the Faithfulness Gap in Textual Explanations of LLM Decisions via Attribution Guidance
by: Alon, Bar, et al.
Published: (2026)

Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models
by: Hsu, Aliyah R., et al.
Published: (2024)

Not All Preferences are What You Need for Post-Training: Selective Alignment Strategy for Preference Optimization
by: Dong, Zhijin
Published: (2025)

Improving Attributed Text Generation of Large Language Models via Preference Learning
by: Li, Dongfang, et al.
Published: (2024)

Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
by: Pombal, José, et al.
Published: (2026)

Bi-directional Bias Attribution: Debiasing Large Language Models without Modifying Prompts
by: Lin, Yujie, et al.
Published: (2026)

Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge
by: Shi, Lin, et al.
Published: (2024)

Layer-wise Positional Bias in Short-Context Language Modeling
by: Rahimi, Maryam, et al.
Published: (2026)

Language Model Re-rankers are Fooled by Lexical Similarities
by: Hagström, Lovisa, et al.
Published: (2025)

Using Language Models to Disambiguate Lexical Choices in Translation
by: Barua, Josh, et al.
Published: (2024)

Polysemanticity or Polysemy? Lexical Identity Confounds Superposition Metrics
by: Hou, Iyad Ait, et al.
Published: (2026)

Token Homogenization under Positional Bias
by: Yusupov, Viacheslav, et al.
Published: (2025)

Hidden Heroes and Gradient Bloats: Layer-Wise Redundancy Inverts Attribution in Transformers
by: Ye, Donald
Published: (2026)

Post-hoc Reward Calibration: A Case Study on Length Bias
by: Huang, Zeyu, et al.
Published: (2024)

Quantifying and Mitigating Self-Preference Bias of LLM Judges
by: Yang, Jinming, et al.
Published: (2026)

Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following
by: Zeng, Jie, et al.
Published: (2025)

Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction
by: Saito, Kuniaki, et al.
Published: (2024)

Technical Report: Impact of Position Bias on Language Models in Token Classification
by: Amor, Mehdi Ben, et al.
Published: (2023)

Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization
by: Olabisi, Olubusayo, et al.
Published: (2024)

Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement
by: Zhan, Pengwei, et al.
Published: (2024)

MultiLS: A Multi-task Lexical Simplification Framework
by: North, Kai, et al.
Published: (2024)

VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations
by: Dumpala, Sri Harsha, et al.
Published: (2024)

ALEXSIS-PT: A New Resource for Portuguese Lexical Simplification
by: North, Kai, et al.
Published: (2022)

BARD10: A New Benchmark Reveals Significance of Bangla Stop-Words in Authorship Attribution
by: Moosa, Abdullah Muhammad, et al.
Published: (2025)