| Main Authors: | Bogacka, Karolina, Höfler, Maximilian, Ganzha, Maria, Samek, Wojciech, Wasielewska-Michniewska, Katarzyna |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.10789 |
Similar Items
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
by: Achtibat, Reduan, et al.
Published: (2024)
EVO-LRP: Evolutionary Optimization of LRP for Interpretable Model Explanations
by: Zhang, Emerald, et al.
Published: (2025)
When LRP Diverges from Leave-One-Out in Transformers
by: You, Weiqiu, et al.
Published: (2025)
Value bounds and Convergence Analysis for Averages of LRP attributions
by: Binder, Alexander, et al.
Published: (2025)
MambaLRP: Explaining Selective State Space Sequence Models
by: Jafari, Farnoush Rezaei, et al.
Published: (2024)
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability
by: Bakish, Yarden, et al.
Published: (2025)
Quality In / Quality Out: Data quality more relevant than model choice in anomaly detection with the UGR'16
by: Camacho, José, et al.
Published: (2023)
Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring
by: Camacho, José, et al.
Published: (2019)
Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
by: Bareeva, Dilyara, et al.
Published: (2024)
Model Science: getting serious about verification, explanation and control of AI systems
by: Biecek, Przemyslaw, et al.
Published: (2025)
Position: Explain to Question not to Justify
by: Biecek, Przemyslaw, et al.
Published: (2024)
Benchmarking Uncertainty and its Disentanglement in multi-label Chest X-Ray Classification
by: Baur, Simon, et al.
Published: (2025)
An accuracy-aware extension to LRP-based pruning for CNNs to prevent cascading accuracy degradation in data-scarce transfer learning
by: Yasui, Daisuke, et al.
Published: (2025)
ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
by: Becking, Daniel, et al.
Published: (2021)
Iterative Inference in a Chess-Playing Neural Network
by: Sandmann, Elias, et al.
Published: (2025)
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
by: Lee, Hyunseok, et al.
Published: (2024)
Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples
by: Bouanani, Oussama, et al.
Published: (2026)
PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits
by: Dreyer, Maximilian, et al.
Published: (2024)
From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance
by: Dreyer, Maximilian, et al.
Published: (2025)
Explaining Predictive Uncertainty by Exposing Second-Order Effects
by: Bley, Florian, et al.
Published: (2024)
Atlas-Alignment: Making Interpretability Transferable Across Language Models
by: Puri, Bruno, et al.
Published: (2025)
A Privacy Preserving System for Movie Recommendations Using Federated Learning
by: Neumann, David, et al.
Published: (2023)
Mechanistic understanding and validation of large AI models with SemanticLens
by: Dreyer, Maximilian, et al.
Published: (2025)
From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation
by: Achtibat, Reduan, et al.
Published: (2022)
Concept activation vectors: a unifying view and adversarial attacks
by: Schnoor, Ekkehard, et al.
Published: (2025)
Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations
by: Erogullari, Eren, et al.
Published: (2025)
Ensuring Medical AI Safety: Interpretability-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data
by: Pahde, Frederik, et al.
Published: (2025)
Optimizing Federated Learning by Entropy-Based Client Selection
by: Lutz, Andreas, et al.
Published: (2024)
Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs
by: Arndt, Jost, et al.
Published: (2025)
Enriching language models with graph-based context information to better understand textual data
by: Roethel, Albert, et al.
Published: (2023)
$α$-TCAV: A Unified Framework for Testing with Concept Activation Vectors
by: Schnoor, Ekkehard, et al.
Published: (2026)
Attribution-Guided Decoding
by: Komorowski, Piotr, et al.
Published: (2025)
In-Context Learning Can Re-learn Forbidden Tasks
by: Xhonneux, Sophie, et al.
Published: (2024)
Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers
by: Hatefi, Sayed Mohammad Vakilzadeh, et al.
Published: (2024)
Sparse, Efficient and Explainable Data Attribution with DualXDA
by: Yolcu, Galip Ümit, et al.
Published: (2024)
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
by: Kahardipraja, Patrick, et al.
Published: (2025)
From Attribution to Action: A Human-Centered Application of Activation Steering
by: Labarta, Tobias, et al.
Published: (2026)
On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth
by: Averkov, Gennadiy, et al.
Published: (2025)
Update Your Transformer to the Latest Release: Re-Basin of Task Vectors
by: Rinaldi, Filippo, et al.
Published: (2025)
The Effects of Multi-Task Learning on ReLU Neural Network Functions
by: Nakhleh, Julia, et al.
Published: (2024)