:: Library Catalog

Bibliographic Details
Main Authors:	Kharlapenko, Dmitrii; Stolfo, Alessandro; Conmy, Arthur; Sachan, Mrinmaya; Jin, Zhijing
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.04843

Similar Items

Probing for Arithmetic Errors in Language Models
by: Sun, Yucheng, et al.
Published: (2025)

Scaling sparse feature circuit finding for in-context learning
by: Kharlapenko, Dmitrii, et al.
Published: (2025)

Improving Large Language Model Safety with Contrastive Representation Learning
by: Simko, Samuel, et al.
Published: (2025)

Uncovering Hidden Correctness in LLM Causal Reasoning via Symbolic Verification
by: He, Paul, et al.
Published: (2026)

Quriosity: Analyzing Human Questioning Behavior and Causal Inquiry through Curiosity-Driven Queries
by: Ceraolo, Roberto, et al.
Published: (2024)

On the Emergence of Induction Heads for In-Context Learning
by: Musat, Tiberiu, et al.
Published: (2025)

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
by: Piedrahita, David Guzman, et al.
Published: (2025)

Confidence Regulation Neurons in Language Models
by: Stolfo, Alessandro, et al.
Published: (2024)

Dense SAE Latents Are Features, Not Bugs
by: Sun, Xiaoqing, et al.
Published: (2025)

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
by: Opedal, Andreas, et al.
Published: (2024)

Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis
by: Jenny, David F., et al.
Published: (2023)

Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?
by: Tarasov, Denis, et al.
Published: (2024)

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection
by: Lalwani, Abhinav, et al.
Published: (2024)

Can Large Language Models Infer Causation from Correlation?
by: Jin, Zhijing, et al.
Published: (2023)

CLadder: Assessing Causal Reasoning in Language Models
by: Jin, Zhijing, et al.
Published: (2023)

CausalCite: A Causal Formulation of Paper Citations
by: Kumar, Ishan, et al.
Published: (2023)

Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
by: Daheim, Nico, et al.
Published: (2024)

Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning
by: Wang, Yucheng, et al.
Published: (2025)

SMART: Self-learning Meta-strategy Agent for Reasoning Tasks
by: Liu, Rongxing, et al.
Published: (2024)

Implicit Personalization in Language Models: A Systematic Study
by: Jin, Zhijing, et al.
Published: (2024)

Base Models Know How to Reason, Thinking Models Learn When
by: Venhoff, Constantin, et al.
Published: (2025)

Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing
by: Ozyurt, Yilmazcan, et al.
Published: (2025)

Simulating Students or Sycophantic Problem Solving? On Misconception Faithfulness of LLM Simulators
by: Do, Heejin, et al.
Published: (2026)

Variational Classification
by: Dhuliawala, Shehzaad, et al.
Published: (2023)

Automatically Finding Reward Model Biases
by: Wang, Atticus, et al.
Published: (2026)

Understanding Reasoning in Thinking Language Models via Steering Vectors
by: Venhoff, Constantin, et al.
Published: (2025)

Line of Sight: On Linear Representations in VLLMs
by: Rajaram, Achyuta, et al.
Published: (2025)

How to Engage Your Readers? Generating Guiding Questions to Promote Active Reading
by: Cui, Peng, et al.
Published: (2024)

Towards Aligning Language Models with Textual Feedback
by: Lloret, Saüc Abadal, et al.
Published: (2024)

Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning
by: Shabalin, Stepan, et al.
Published: (2025)

Can LLMs Model Incorrect Student Reasoning? A Case Study on Distractor Generation
by: Zengaffinen, Yanick, et al.
Published: (2026)

Benchmarking and Enhancing Text-to-Image Models for Generating Visual Representations in Early Arithmetic Education
by: Wang, Junling, et al.
Published: (2026)

Learning to Reason Efficiently with A* Post-Training
by: Opedal, Andreas, et al.
Published: (2026)

Thought Anchors: Which LLM Reasoning Steps Matter?
by: Bogdan, Paul C., et al.
Published: (2025)

Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models
by: Wang, Junling, et al.
Published: (2025)

Can Vision-Language Models Solve Visual Math Equations?
by: Choudhury, Monjoy Narayan, et al.
Published: (2025)

SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning
by: Adarsh, Shivam, et al.
Published: (2024)

Multilingual Performance Biases of Large Language Models in Education
by: Gupta, Vansh, et al.
Published: (2025)

Test of Time: Rethinking Temporal Signal of Benchmark Contamination
by: Zhang, Terry Jingchen, et al.
Published: (2025)

Post-Training Language Models for Crosslingual Consistency
by: Liu, Tianyu, et al.
Published: (2026)