:: Library Catalog

Bibliographic Details
Main Authors:	Nguyen, Van Bach; Seifert, Christin; Schlötterer, Jörg
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2503.04463

Similar Items

CEval: A Benchmark for Evaluating Counterfactual Text Generation
by: Nguyen, Van Bach, et al.
Published: (2024)

LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study
by: Nguyen, Van Bach, et al.
Published: (2024)

From Black Boxes to Conversations: Incorporating XAI in a Conversational Agent
by: Nguyen, Van Bach, et al.
Published: (2022)

Persuasion Tokens for Editing Factual Knowledge in LLMs
by: Youssef, Paul, et al.
Published: (2026)

Tracing and Reversing Edits in LLMs
by: Youssef, Paul, et al.
Published: (2025)

How to Make LLMs Forget: On Reversing In-Context Knowledge Edits
by: Youssef, Paul, et al.
Published: (2024)

The Queen of England is not England's Queen: On the Lack of Factual Coherency in PLMs
by: Youssef, Paul, et al.
Published: (2024)

Enhancing Fact Retrieval in PLMs through Truthfulness
by: Youssef, Paul, et al.
Published: (2024)

A Second Look on BASS -- Boosting Abstractive Summarization with Unified Semantic Graphs -- A Replication Study
by: Koraş, Osman Alperen, et al.
Published: (2024)

Has this Fact been Edited? Detecting Knowledge Edits in Language Models
by: Youssef, Paul, et al.
Published: (2024)

Behavioral Analysis of Information Salience in Large Language Models
by: Trienes, Jan, et al.
Published: (2025)

Position: Editing Large Language Models Poses Serious Safety Risks
by: Youssef, Paul, et al.
Published: (2025)

Can Fine-Tuning Erase Your Edits? On the Fragile Coexistence of Knowledge Editing and Adaptation
by: Cheng, Yinjie, et al.
Published: (2025)

Marcel: A Lightweight and Open-Source Conversational Agent for University Student Support
by: Trienes, Jan, et al.
Published: (2025)

Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals
by: Wang, Qianli, et al.
Published: (2025)

Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counterfactual Example Generation
by: Wang, Qianli, et al.
Published: (2026)

InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification
by: Trienes, Jan, et al.
Published: (2024)

An XAI-based Analysis of Shortcut Learning in Neural Networks
by: Le, Phuong Quynh, et al.
Published: (2025)

Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious Correlations?
by: Le, Phuong Quynh, et al.
Published: (2023)

Explanation format does not matter; but explanations do -- An Eggsbert study on explaining Bayesian Optimisation tasks
by: Chakraborty, Tanmay, et al.
Published: (2025)

Towards Interpretable Deep Neural Networks for Tabular Data
by: Elhadri, Khawla, et al.
Published: (2025)

XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders
by: Elhadri, Khawla, et al.
Published: (2025)

Investigating the Impact of Randomness on Reproducibility in Computer Vision: A Study on Applications in Civil Engineering and Medicine
by: Eryılmaz, Bahadır, et al.
Published: (2024)

DRIV-EX: Counterfactual Explanations for Driving LLMs
by: Cardiel, Amaia, et al.
Published: (2026)

Invariant Learning with Annotation-free Environments
by: Le, Phuong Quynh, et al.
Published: (2025)

Out of Spuriousity: Improving Robustness to Spurious Correlations without Group Annotations
by: Le, Phuong Quynh, et al.
Published: (2024)

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
by: Li, Dongfang, et al.
Published: (2023)

Few-Shot Knowledge Distillation of LLMs With Counterfactual Explanations
by: Hamman, Faisal, et al.
Published: (2025)

Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification
by: De Santi, Lisa Anita, et al.
Published: (2024)

A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers
by: McAleese, Stephen, et al.
Published: (2024)

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers
by: Kuhn, Lukas, et al.
Published: (2025)

Counterfactual Simulatability of LLM Explanations for Generation Tasks
by: Limpijankit, Marvin, et al.
Published: (2025)

LLMs Don't Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
by: Mayne, Harry, et al.
Published: (2025)

Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges
by: Pathak, Shreyasi, et al.
Published: (2024)

Funzac at CoMeDi Shared Task: Modeling Annotator Disagreement from Word-In-Context Perspectives
by: Sarumi, Olufunke O., et al.
Published: (2025)

The Impact of Annotator Personas on LLM Behavior Across the Perspectivism Spectrum
by: Sarumi, Olufunke O., et al.
Published: (2025)

WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
by: Jiang, Yiwen, et al.
Published: (2025)

Mitigating Text Toxicity with Counterfactual Generation
by: Bhan, Milan, et al.
Published: (2024)

Prompt-Counterfactual Explanations for Generative AI System Behavior
by: Goethals, Sofie, et al.
Published: (2026)

LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals
by: Toker, Gilat, et al.
Published: (2026)