| Main Authors: | Elhadri, Khawla, Michalski, Tomasz, Wróbel, Adam, Schlötterer, Jörg, Zieliński, Bartosz, Seifert, Christin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.09340 |
Similar Items
Towards Interpretable Deep Neural Networks for Tabular Data
by: Elhadri, Khawla, et al.
Published: (2025)
XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders
by: Elhadri, Khawla, et al.
Published: (2025)
Persuasion Tokens for Editing Factual Knowledge in LLMs
by: Youssef, Paul, et al.
Published: (2026)
Invariant Learning with Annotation-free Environments
by: Le, Phuong Quynh, et al.
Published: (2025)
Out of Spuriousity: Improving Robustness to Spurious Correlations without Group Annotations
by: Le, Phuong Quynh, et al.
Published: (2024)
A Second Look on BASS -- Boosting Abstractive Summarization with Unified Semantic Graphs -- A Replication Study
by: Koraş, Osman Alperen, et al.
Published: (2024)
An XAI-based Analysis of Shortcut Learning in Neural Networks
by: Le, Phuong Quynh, et al.
Published: (2025)
Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious Correlations?
by: Le, Phuong Quynh, et al.
Published: (2023)
Shortcut Mitigation via Spurious-Positive Samples
by: Le, Phuong Quynh, et al.
Published: (2026)
Personalized Interpretability -- Interactive Alignment of Prototypical Parts Networks
by: Michalski, Tomasz, et al.
Published: (2025)
Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers
by: Kuhn, Lukas, et al.
Published: (2025)
OMENN: One Matrix to Explain Neural Networks
by: Wróbel, Adam, et al.
Published: (2024)
Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges
by: Pathak, Shreyasi, et al.
Published: (2024)
ProtoQuant: Quantization of Prototypical Parts For General and Fine-Grained Image Classification
by: Janusz, Mikołaj, et al.
Published: (2026)
The Queen of England is not England's Queen: On the Lack of Factual Coherency in PLMs
by: Youssef, Paul, et al.
Published: (2024)
Enhancing Fact Retrieval in PLMs through Truthfulness
by: Youssef, Paul, et al.
Published: (2024)
Fragment-Wise Interpretability in Graph Neural Networks via Molecule Decomposition and Contribution Analysis
by: Musiał, Sebastian, et al.
Published: (2025)
LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision
by: Pach, Mateusz, et al.
Published: (2024)
Guiding LLMs to Generate High-Fidelity and High-Quality Counterfactual Explanations for Text Classification
by: Nguyen, Van Bach, et al.
Published: (2025)
From Black Boxes to Conversations: Incorporating XAI in a Conversational Agent
by: Nguyen, Van Bach, et al.
Published: (2022)
CEval: A Benchmark for Evaluating Counterfactual Text Generation
by: Nguyen, Van Bach, et al.
Published: (2024)
Has this Fact been Edited? Detecting Knowledge Edits in Language Models
by: Youssef, Paul, et al.
Published: (2024)
Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification
by: De Santi, Lisa Anita, et al.
Published: (2024)
Enhancing Chemical Explainability Through Counterfactual Masking
by: Janisiów, Łukasz, et al.
Published: (2025)
DAVE: Distribution-aware Attribution via ViT Gradient Decomposition
by: Wróbel, Adam, et al.
Published: (2026)
Behavioral Analysis of Information Salience in Large Language Models
by: Trienes, Jan, et al.
Published: (2025)
Tracing and Reversing Edits in LLMs
by: Youssef, Paul, et al.
Published: (2025)
How to Make LLMs Forget: On Reversing In-Context Knowledge Edits
by: Youssef, Paul, et al.
Published: (2024)
Comparative Explanations: Explanation Guided Decision Making for Human-in-the-Loop Preference Selection
by: Chakraborty, Tanmay, et al.
Published: (2025)
PhAME: Phenotype-Aware Molecular Editing via Latent Diffusion
by: Janisiów, Łukasz, et al.
Published: (2026)
Explainable Bayesian Optimization
by: Chakraborty, Tanmay, et al.
Published: (2024)
Position: Editing Large Language Models Poses Serious Safety Risks
by: Youssef, Paul, et al.
Published: (2025)
LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study
by: Nguyen, Van Bach, et al.
Published: (2024)
Investigating the Impact of Randomness on Reproducibility in Computer Vision: A Study on Applications in Civil Engineering and Medicine
by: Eryılmaz, Bahadır, et al.
Published: (2024)
Efficient LLM Moderation with Multi-Layer Latent Prototypes
by: Chrabąszcz, Maciej, et al.
Published: (2025)
ProPML: Probability Partial Multi-label Learning
by: Struski, Łukasz, et al.
Published: (2024)
SONG: Self-Organizing Neural Graphs
by: Struski, Łukasz, et al.
Published: (2021)
One Mask to Rule Them All: On Hidden Facts after Editing and How to Find Them
by: Holmov, Ali, et al.
Published: (2026)
Efficient Multi-Source Knowledge Transfer by Model Merging
by: Osial, Marcin, et al.
Published: (2025)
Can Fine-Tuning Erase Your Edits? On the Fragile Coexistence of Knowledge Editing and Adaptation
by: Cheng, Yinjie, et al.
Published: (2025)