:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Yuxuan; Ferraro, Francis
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2412.14368

Similar Items

Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance
by: Jiang, Yuxuan, et al.
Published: (2026)

DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models
by: Jiang, Yuxuan, et al.
Published: (2025)

SAGA: A Participant-specific Examination of Story Alternatives and Goal Applicability for a Deeper Understanding of Complex Events
by: Vallurupalli, Sai, et al.
Published: (2024)

CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events
by: Vallurupalli, Sai, et al.
Published: (2025)

Explore the Reasoning Capability of LLMs in the Chess Testbed
by: Wang, Shu, et al.
Published: (2024)

Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
by: Kim, Jisu, et al.
Published: (2025)

Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles
by: Gabay, Adi, et al.
Published: (2026)

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
by: Sun, Jiaxing, et al.
Published: (2024)

World Models for Math Story Problems
by: Opedal, Andreas, et al.
Published: (2023)

Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks
by: Ruan, Zhiwen, et al.
Published: (2025)

Beyond Benchmarks: MathArena as an Evaluation Platform for Mathematics with LLMs
by: Dekoninck, Jasper, et al.
Published: (2026)

DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents
by: Zhao, Yilun, et al.
Published: (2023)

MatheMagic: Generating Dynamic Mathematics Benchmarks Robust to Memorization
by: O'Brien, Dayyán, et al.
Published: (2025)

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
by: Hans, Abhimanyu, et al.
Published: (2024)

Beyond Memorization: Testing LLM Reasoning on Unseen Theory of Computation Tasks
by: Shelat, Shlok, et al.
Published: (2026)

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
by: Guan, Xinyu, et al.
Published: (2025)

Memorization vs. Reasoning: Updating LLMs with New Knowledge
by: Li, Aochong Oliver, et al.
Published: (2025)

TAG-EQA: Text-And-Graph for Event Question Answering via Structured Prompting Strategies
by: Kadam, Maithili, et al.
Published: (2025)

SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese
by: Xu, Liang, et al.
Published: (2024)

Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
by: Lou, Siyu, et al.
Published: (2024)

Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems
by: Miner, Stephen, et al.
Published: (2024)

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs
by: Wang, Lei, et al.
Published: (2024)

Right Is Not Enough: The Pitfalls of Outcome Supervision in Training LLMs for Math Reasoning
by: Guo, Jiaxing, et al.
Published: (2025)

UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization
by: Uddin, Md Nayem, et al.
Published: (2024)

CoinMath: Harnessing the Power of Coding Instruction for Math LLMs
by: Wei, Chengwei, et al.
Published: (2024)

MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark
by: Liu, Hongwei, et al.
Published: (2024)

Reason to Rote: Rethinking Memorization in Reasoning
by: Du, Yupei, et al.
Published: (2025)

If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
by: Dipta, Shubhashis Roy, et al.
Published: (2025)

Inductive Bias Extraction and Matching for LLM Prompts
by: Angel, Christian M., et al.
Published: (2025)

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
by: Zeng, Liang, et al.
Published: (2024)

Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs
by: Mukherjee, Sagnik, et al.
Published: (2025)

FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains
by: Zhao, Yilun, et al.
Published: (2023)

MathArena: Evaluating LLMs on Uncontaminated Math Competitions
by: Balunović, Mislav, et al.
Published: (2025)

Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning
by: Roy, Tiasa Singha, et al.
Published: (2025)

The CompMath-MCQ Dataset: Are LLMs Ready for Higher-Level Math?
by: Raimondi, Bianca, et al.
Published: (2026)

An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning
by: Chen, Zui, et al.
Published: (2024)

Beyond Memorization: The Challenge of Random Memory Access in Language Models
by: Zhu, Tongyao, et al.
Published: (2024)

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
by: Kassem, Aly M., et al.
Published: (2024)

Mitigating Memorization in LLMs using Activation Steering
by: Suri, Manan, et al.
Published: (2025)

Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
by: Dipta, Shubhashis Roy, et al.
Published: (2025)