| Main Authors: | Nguyen, Van Bach, Seifert, Christin, Schlötterer, Jörg |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.04463 |
Similar Items
CEval: A Benchmark for Evaluating Counterfactual Text Generation
by: Nguyen, Van Bach, et al.
Published: (2024)
LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study
by: Nguyen, Van Bach, et al.
Published: (2024)
From Black Boxes to Conversations: Incorporating XAI in a Conversational Agent
by: Nguyen, Van Bach, et al.
Published: (2022)
Persuasion Tokens for Editing Factual Knowledge in LLMs
by: Youssef, Paul, et al.
Published: (2026)
Tracing and Reversing Edits in LLMs
by: Youssef, Paul, et al.
Published: (2025)
How to Make LLMs Forget: On Reversing In-Context Knowledge Edits
by: Youssef, Paul, et al.
Published: (2024)
The Queen of England is not England's Queen: On the Lack of Factual Coherency in PLMs
by: Youssef, Paul, et al.
Published: (2024)
Enhancing Fact Retrieval in PLMs through Truthfulness
by: Youssef, Paul, et al.
Published: (2024)
A Second Look on BASS -- Boosting Abstractive Summarization with Unified Semantic Graphs -- A Replication Study
by: Koraş, Osman Alperen, et al.
Published: (2024)
Has this Fact been Edited? Detecting Knowledge Edits in Language Models
by: Youssef, Paul, et al.
Published: (2024)
Behavioral Analysis of Information Salience in Large Language Models
by: Trienes, Jan, et al.
Published: (2025)
Position: Editing Large Language Models Poses Serious Safety Risks
by: Youssef, Paul, et al.
Published: (2025)
Can Fine-Tuning Erase Your Edits? On the Fragile Coexistence of Knowledge Editing and Adaptation
by: Cheng, Yinjie, et al.
Published: (2025)
Marcel: A Lightweight and Open-Source Conversational Agent for University Student Support
by: Trienes, Jan, et al.
Published: (2025)
Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals
by: Wang, Qianli, et al.
Published: (2025)
Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counterfactual Example Generation
by: Wang, Qianli, et al.
Published: (2026)
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification
by: Trienes, Jan, et al.
Published: (2024)
An XAI-based Analysis of Shortcut Learning in Neural Networks
by: Le, Phuong Quynh, et al.
Published: (2025)
Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious Correlations?
by: Le, Phuong Quynh, et al.
Published: (2023)
Explanation format does not matter; but explanations do -- An Eggsbert study on explaining Bayesian Optimisation tasks
by: Chakraborty, Tanmay, et al.
Published: (2025)
Towards Interpretable Deep Neural Networks for Tabular Data
by: Elhadri, Khawla, et al.
Published: (2025)
XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders
by: Elhadri, Khawla, et al.
Published: (2025)
Investigating the Impact of Randomness on Reproducibility in Computer Vision: A Study on Applications in Civil Engineering and Medicine
by: Eryılmaz, Bahadır, et al.
Published: (2024)
DRIV-EX: Counterfactual Explanations for Driving LLMs
by: Cardiel, Amaia, et al.
Published: (2026)
Invariant Learning with Annotation-free Environments
by: Le, Phuong Quynh, et al.
Published: (2025)
Out of Spuriousity: Improving Robustness to Spurious Correlations without Group Annotations
by: Le, Phuong Quynh, et al.
Published: (2024)
Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
by: Li, Dongfang, et al.
Published: (2023)
Few-Shot Knowledge Distillation of LLMs With Counterfactual Explanations
by: Hamman, Faisal, et al.
Published: (2025)
Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification
by: De Santi, Lisa Anita, et al.
Published: (2024)
A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers
by: McAleese, Stephen, et al.
Published: (2024)
Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers
by: Kuhn, Lukas, et al.
Published: (2025)
Counterfactual Simulatability of LLM Explanations for Generation Tasks
by: Limpijankit, Marvin, et al.
Published: (2025)
LLMs Don't Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
by: Mayne, Harry, et al.
Published: (2025)
Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges
by: Pathak, Shreyasi, et al.
Published: (2024)
Funzac at CoMeDi Shared Task: Modeling Annotator Disagreement from Word-In-Context Perspectives
by: Sarumi, Olufunke O., et al.
Published: (2025)
The Impact of Annotator Personas on LLM Behavior Across the Perspectivism Spectrum
by: Sarumi, Olufunke O., et al.
Published: (2025)
WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
by: Jiang, Yiwen, et al.
Published: (2025)
Mitigating Text Toxicity with Counterfactual Generation
by: Bhan, Milan, et al.
Published: (2024)
Prompt-Counterfactual Explanations for Generative AI System Behavior
by: Goethals, Sofie, et al.
Published: (2026)
LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals
by: Toker, Gilat, et al.
Published: (2026)