:: Library Catalog

Bibliographic Details
Main Authors:	Eichin, Florian; Du, Yupei; Mondorf, Philipp; Matveev, Maria; Plank, Barbara; Hedderich, Michael A.
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.20076

Similar Items

What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
by: Hedderich, Michael A., et al.
Published: (2025)

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)

Reason to Rote: Rethinking Memorization in Reasoning
by: Du, Yupei, et al.
Published: (2025)

Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
by: Eichin, Florian, et al.
Published: (2025)

Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining
by: Körner, Felicia, et al.
Published: (2026)

Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
by: Mondorf, Philipp, et al.
Published: (2024)

Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination
by: Chen, Qiqi, et al.
Published: (2024)

Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
by: Mondorf, Philipp, et al.
Published: (2024)

Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models
by: Mondorf, Philipp, et al.
Published: (2024)

Tracing Uncertainty in Language Model "Reasoning"
by: Grünefeld, Nils, et al.
Published: (2026)

If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models
by: Orth, Jasmin, et al.
Published: (2025)

LogicSkills: A Structured Benchmark for Formal Reasoning in Large Language Models
by: Rabern, Brian, et al.
Published: (2026)

Semantic Component Analysis: Introducing Multi-Topic Distributions to Clustering-Based Topic Modeling
by: Eichin, Florian, et al.
Published: (2024)

BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods
by: Mondorf, Philipp, et al.
Published: (2025)

Reasoning that Travels: Dissecting How Chain-of-Thought Transfers Across Models
by: Cheng, Xinyuan, et al.
Published: (2026)

The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
by: Bertolazzi, Leonardo, et al.
Published: (2025)

FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
by: Du, Yupei, et al.
Published: (2023)

LPDS: Evaluating LLM Robustness Through Logic-Preserving Difficulty Scaling
by: Mondorf, Philipp, et al.
Published: (2026)

Nonparametric Data Attribution for Diffusion Models
by: Zhao, Yutian, et al.
Published: (2025)

Defending Deep Regression Models against Backdoor Attacks
by: Du, Lingyu, et al.
Published: (2024)

Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
by: Wei, Dennis, et al.
Published: (2024)

Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning
by: Mondorf, Philipp, et al.
Published: (2025)

Efficient Sketches for Training Data Attribution and Studying the Loss Landscape
by: Schioppa, Andrea
Published: (2024)

Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
by: Lin, Jinxu, et al.
Published: (2024)

Probe-Based Data Attribution: Discovering and Mitigating Undesirable Behaviors in LLM Post-Training
by: Xiao, Frank, et al.
Published: (2026)

Wrapper Boxes: Faithful Attribution of Model Predictions to Training Data
by: Su, Yiheng, et al.
Published: (2023)

Intriguing Properties of Data Attribution on Diffusion Models
by: Zheng, Xiaosen, et al.
Published: (2023)

Enhancing Training Data Attribution with Representational Optimization
by: Sun, Weiwei, et al.
Published: (2025)

Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective
by: Zheng, Guanhua, et al.
Published: (2025)

UniGLM: Training One Unified Language Model for Text-Attributed Graph Embedding
by: Fang, Yi, et al.
Published: (2024)

Training Feature Attribution for Vision Models
by: Bacha, Aziz, et al.
Published: (2025)

To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity
by: Sedova, Anastasiia, et al.
Published: (2024)

Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration
by: Shim, Ryan Soh-Eun, et al.
Published: (2026)

Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
by: Aerni, Michael, et al.
Published: (2024)

Application of Sensitivity Analysis Methods for Studying Neural Network Models
by: Miao, Jiaxuan, et al.
Published: (2025)

Backward Compatibility in Attributive Explanation and Enhanced Model Training Method
by: Matsuno, Ryuta
Published: (2024)

GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning
by: Murata, Naoki, et al.
Published: (2026)

Training Data Attribution via Approximate Unrolled Differentiation
by: Bae, Juhan, et al.
Published: (2024)

Hybrid Attribution Priors for Explainable and Robust Model Training
by: Zhang, Zhuoran, et al.
Published: (2025)

Disentangling the Roles of Representation and Selection in Data Pruning
by: Du, Yupei, et al.
Published: (2025)