| Main Authors: | Eichin, Florian, Du, Yupei, Mondorf, Philipp, Matveev, Maria, Plank, Barbara, Hedderich, Michael A. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.20076 |
Similar Items
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
by: Hedderich, Michael A., et al.
Published: (2025)
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
Reason to Rote: Rethinking Memorization in Reasoning
by: Du, Yupei, et al.
Published: (2025)
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
by: Eichin, Florian, et al.
Published: (2025)
Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining
by: Körner, Felicia, et al.
Published: (2026)
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
by: Mondorf, Philipp, et al.
Published: (2024)
Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination
by: Chen, Qiqi, et al.
Published: (2024)
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
by: Mondorf, Philipp, et al.
Published: (2024)
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
Tracing Uncertainty in Language Model "Reasoning"
by: Grünefeld, Nils, et al.
Published: (2026)
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models
by: Orth, Jasmin, et al.
Published: (2025)
LogicSkills: A Structured Benchmark for Formal Reasoning in Large Language Models
by: Rabern, Brian, et al.
Published: (2026)
Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling
by: Eichin, Florian, et al.
Published: (2024)
BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods
by: Mondorf, Philipp, et al.
Published: (2025)
Reasoning that Travels: Dissecting How Chain-of-Thought Transfers Across Models
by: Cheng, Xinyuan, et al.
Published: (2026)
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
by: Bertolazzi, Leonardo, et al.
Published: (2025)
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
by: Du, Yupei, et al.
Published: (2023)
LPDS: Evaluating LLM Robustness Through Logic-Preserving Difficulty Scaling
by: Mondorf, Philipp, et al.
Published: (2026)
Nonparametric Data Attribution for Diffusion Models
by: Zhao, Yutian, et al.
Published: (2025)
Defending Deep Regression Models against Backdoor Attacks
by: Du, Lingyu, et al.
Published: (2024)
Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
by: Wei, Dennis, et al.
Published: (2024)
Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning
by: Mondorf, Philipp, et al.
Published: (2025)
Efficient Sketches for Training Data Attribution and Studying the Loss Landscape
by: Schioppa, Andrea
Published: (2024)
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
by: Lin, Jinxu, et al.
Published: (2024)
Probe-Based Data Attribution: Discovering and Mitigating Undesirable Behaviors in LLM Post-Training
by: Xiao, Frank, et al.
Published: (2026)
Wrapper Boxes: Faithful Attribution of Model Predictions to Training Data
by: Su, Yiheng, et al.
Published: (2023)
Intriguing Properties of Data Attribution on Diffusion Models
by: Zheng, Xiaosen, et al.
Published: (2023)
Enhancing Training Data Attribution with Representational Optimization
by: Sun, Weiwei, et al.
Published: (2025)
Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective
by: Zheng, Guanhua, et al.
Published: (2025)
UniGLM: Training One Unified Language Model for Text-Attributed Graph Embedding
by: Fang, Yi, et al.
Published: (2024)
Training Feature Attribution for Vision Models
by: Bacha, Aziz, et al.
Published: (2025)
To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity
by: Sedova, Anastasiia, et al.
Published: (2024)
Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration
by: Shim, Ryan Soh-Eun, et al.
Published: (2026)
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
by: Aerni, Michael, et al.
Published: (2024)
Application of Sensitivity Analysis Methods for Studying Neural Network Models
by: Miao, Jiaxuan, et al.
Published: (2025)
Backward Compatibility in Attributive Explanation and Enhanced Model Training Method
by: Matsuno, Ryuta
Published: (2024)
GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning
by: Murata, Naoki, et al.
Published: (2026)
Training Data Attribution via Approximate Unrolled Differentiation
by: Bae, Juhan, et al.
Published: (2024)
Hybrid Attribution Priors for Explainable and Robust Model Training
by: Zhang, Zhuoran, et al.
Published: (2025)
Disentangling the Roles of Representation and Selection in Data Pruning
by: Du, Yupei, et al.
Published: (2025)