:: Library Catalog

Bibliographic Details
Main Authors:	Cheng, Ziling; Cao, Meng; Pishdad, Leila; Cao, Yanshuai; Cheung, Jackie Chi Kit
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2505.23701

Similar Items

Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs
by: Cheng, Ziling, et al.
Published: (2025)

PreSumm: Predicting Summarization Performance Without Summarizing
by: Koniaev, Steven, et al.
Published: (2025)

Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
by: Yu, Lei, et al.
Published: (2024)

Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought
by: Ramji, Keshav, et al.
Published: (2026)

Solving the Challenge Set without Solving the Task: On Winograd Schemas as a Test of Pronominal Coreference Resolution
by: Porada, Ian, et al.
Published: (2024)

Ensemble Distillation for Unsupervised Constituency Parsing
by: Shayegh, Behzad, et al.
Published: (2023)

Can LLMs Solve longer Math Word Problems Better?
by: Xu, Xin, et al.
Published: (2024)

Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
by: Chehbouni, Khaoula, et al.
Published: (2025)

Towards Learning to Reason: Comparing LLMs with Neuro-Symbolic on Arithmetic Relations in Abstract Reasoning
by: Hersche, Michael, et al.
Published: (2024)

CoT-Pose: Chain-of-Thought Reasoning for 3D Pose Generation from Abstract Prompts
by: Cha, Junuk, et al.
Published: (2025)

Reasoning as an Attack Surface: Adaptive Evolutionary CoT Jailbreaks for LLMs
by: Li, Jianan, et al.
Published: (2026)

A Controlled Reevaluation of Coreference Resolution Models
by: Porada, Ian, et al.
Published: (2024)

Does This Summary Answer My Question? Modeling Query-Focused Summary Readers with Rational Speech Acts
by: Piano, Cesare Spinoso-Di, et al.
Published: (2024)

CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models
by: Qian, Cheng, et al.
Published: (2023)

The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation
by: Lan, Yifan, et al.
Published: (2026)

Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
by: Li, Jiachun, et al.
Published: (2024)

LLMs Faithfully and Iteratively Compute Answers During CoT: A Systematic Analysis With Multi-step Arithmetics
by: Kudo, Keito, et al.
Published: (2024)

What Makes Math Word Problems Challenging for LLMs?
by: Srivatsa, KV Aditya, et al.
Published: (2024)

Self-consistent Reasoning For Solving Math Word Problems
by: Xiong, Jing, et al.
Published: (2022)

Improving the Calibration of Confidence Scores in Text Generation Using the Output Distribution's Characteristics
by: Flores, Lorenzo Jaime Yu, et al.
Published: (2025)

$\texttt{COSMIC}$: Mutual Information for Task-Agnostic Summarization Evaluation
by: Darrin, Maxime, et al.
Published: (2024)

Can Vision Language Models Be Adaptive in Mathematics Education? A Learner Model-based Rubric Study
by: Gao, Jie, et al.
Published: (2026)

Long or short CoT? Investigating Instance-level Switch of Large Reasoning Models
by: Zhang, Ruiqi, et al.
Published: (2025)

Structured Reasoning with Tree-of-Thoughts for Bengali Math Word Problems
by: Mahmood, Aurprita, et al.
Published: (2025)

Augmenting Math Word Problems via Iterative Question Composing
by: Liu, Haoxiong, et al.
Published: (2024)

CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs
by: Li, Li, et al.
Published: (2025)

Linear Half-Space Problems in Kinetic Theory: Abstract Formulation and Regime Transitions
by: Bernhoff, Niclas
Published: (2022)

Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game
by: Samadarshi, Prisha, et al.
Published: (2024)

A Unified View of Abstract Visual Reasoning Problems
by: Małkiński, Mikołaj, et al.
Published: (2024)

Understanding Formal Reasoning Failures in LLMs as Abstract Interpreters
by: Mitchell, Jacqueline L., et al.
Published: (2025)

Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning
by: Lyu, Tianwen, et al.
Published: (2025)

On the Morse Index with Constraints I: An Abstract Formulation
by: Tran, Hung, et al.
Published: (2020)

Adversarial Math Word Problem Generation
by: Xie, Roy, et al.
Published: (2024)

How Likely Do LLMs with CoT Mimic Human Reasoning?
by: Bao, Guangsheng, et al.
Published: (2024)

Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning
by: Flores, Lorenzo Jaime Yu, et al.
Published: (2026)

Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective
by: Porada, Ian, et al.
Published: (2023)

$(RSA)^2$: A Rhetorical-Strategy-Aware Rational Speech Act Framework for Figurative Language Understanding
by: Piano, Cesare Spinoso-Di, et al.
Published: (2025)

Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
by: Nai, Ruiqian, et al.
Published: (2024)

When LLMs Meet API Documentation: Can Retrieval Augmentation Aid Code Generation Just as It Helps Developers?
by: Chen, Jingyi, et al.
Published: (2025)

Solving Math Word Problems via Cooperative Reasoning induced Language Models
by: Zhu, Xinyu, et al.
Published: (2022)