| Main Authors: | Du, Yupei, Mondorf, Philipp, Casola, Silvia, Yao, Yuekun, Litschko, Robert, Plank, Barbara |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.04782 |
Similar Items
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
by: Mondorf, Philipp, et al.
Published: (2024)
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
by: Mondorf, Philipp, et al.
Published: (2024)
Tracing Uncertainty in Language Model "Reasoning"
by: Grünefeld, Nils, et al.
Published: (2026)
To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity
by: Sedova, Anastasiia, et al.
Published: (2024)
LogicSkills: A Structured Benchmark for Formal Reasoning in Large Language Models
by: Rabern, Brian, et al.
Published: (2026)
Reasoning that Travels: Dissecting How Chain-of-Thought Transfers Across Models
by: Cheng, Xinyuan, et al.
Published: (2026)
ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior
by: Eichin, Florian, et al.
Published: (2025)
Barriers to Universal Reasoning With Transformers (And How to Overcome Them)
by: Kraus, Oliver, et al.
Published: (2026)
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models
by: Orth, Jasmin, et al.
Published: (2025)
Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers
by: Kohli, Harsh, et al.
Published: (2026)
Resource-Lean Lexicon Induction for German Dialects
by: Litschko, Robert, et al.
Published: (2026)
BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods
by: Mondorf, Philipp, et al.
Published: (2025)
Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective
by: Chen, Beiduo, et al.
Published: (2026)
From Memorization to Reasoning in the Spectrum of Loss Curvature
by: Merullo, Jack, et al.
Published: (2025)
MaiNLP at SemEval-2024 Task 1: Analyzing Source Language Selection in Cross-Lingual Textual Relatedness
by: Zhou, Shijia, et al.
Published: (2024)
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
by: Bertolazzi, Leonardo, et al.
Published: (2025)
Memorization vs. Reasoning: Updating LLMs with New Knowledge
by: Li, Aochong Oliver, et al.
Published: (2025)
Cross-Dialect Information Retrieval: Information Access in Low-Resource and High-Variance Languages
by: Litschko, Robert, et al.
Published: (2024)
Rethinking LLM Memorization through the Lens of Adversarial Compression
by: Schwarzschild, Avi, et al.
Published: (2024)
Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?
by: Weber-Genzel, Leon, et al.
Published: (2023)
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
by: Du, Yupei, et al.
Published: (2023)
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
by: Wu, Mingqi, et al.
Published: (2025)
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
by: Lou, Siyu, et al.
Published: (2024)
Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora
by: Litschko, Robert, et al.
Published: (2025)
Information Asymmetry across Language Varieties: A Case Study on Cantonese-Mandarin and Bavarian-German QA
by: Pei, Renhao, et al.
Published: (2026)
Inference-Time Rethinking with Latent Thought Vectors for Math Reasoning
by: Kong, Deqian, et al.
Published: (2026)
Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination
by: Chen, Qiqi, et al.
Published: (2024)
Rethinking Entropy Regularization in Large Reasoning Models
by: Jiang, Yuxian, et al.
Published: (2025)
Evaluating Large Language Models for Cross-Lingual Retrieval
by: Zuo, Longfei, et al.
Published: (2025)
Exploring Large Language Models for Product Attribute Value Identification
by: Sabeh, Kassem, et al.
Published: (2024)
An Empirical Comparison of Generative Approaches for Product Attribute-Value Identification
by: Sabeh, Kassem, et al.
Published: (2024)
Rethinking Expert Trajectory Utilization in LLM Post-training for Mathematical Reasoning
by: Ding, Bowen, et al.
Published: (2025)
Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models
by: Li, Zihao, et al.
Published: (2025)
Rethinking Chain-of-Thought Reasoning for Videos
by: Zhong, Yiwu, et al.
Published: (2025)
Language models can learn implicit multi-hop reasoning, but only if they have lots of training data
by: Yao, Yuekun, et al.
Published: (2025)
SteerEval: Inference-time Interventions Strengthen Multilingual Generalization in Neural Summarization Metrics
by: Casola, Silvia, et al.
Published: (2026)
"Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?
by: Chen, Beiduo, et al.
Published: (2024)
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
by: Yan, Kai, et al.
Published: (2025)