| Main Authors: | Guo, Jiaxing, Yang, Wenjie, Zhang, Shengzhong, Xu, Tongshan, Du, Lun, Zheng, Da, Huang, Zengfeng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.06877 |
Similar Items
Your Graph Recommender is Provably a Single-view Graph Contrastive Learning
by: Yang, Wenjie, et al.
Published: (2024)
Re$^2$Math: Benchmarking Theorem Retrieval in Research-Level Mathematics
by: Lyu, Zicheng, et al.
Published: (2026)
Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning
by: Roy, Tiasa Singha, et al.
Published: (2025)
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models
by: Wang, Binghai, et al.
Published: (2026)
Solving Math Word Problems via Cooperative Reasoning induced Language Models
by: Zhu, Xinyu, et al.
Published: (2022)
Guiding Through Complexity: What Makes Good Supervision for Hard Math Reasoning Tasks?
by: He, Xuan, et al.
Published: (2024)
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
by: Guan, Xinyu, et al.
Published: (2025)
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
by: Li, Xiaomin, et al.
Published: (2025)
Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards
by: Han, Tianyang, et al.
Published: (2026)
NFT: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning
by: Chen, Huayu, et al.
Published: (2025)
StructComp: Substituting Propagation with Structural Compression in Training Graph Contrastive Learning
by: Zhang, Shengzhong, et al.
Published: (2023)
Position: On the Methodological Pitfalls of Evaluating Base LLMs for Reasoning
by: Chan, Jason, et al.
Published: (2025)
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese
by: Xu, Liang, et al.
Published: (2024)
Can LLMs Solve longer Math Word Problems Better?
by: Xu, Xin, et al.
Published: (2024)
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents
by: Zhao, Yilun, et al.
Published: (2023)
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
by: Liu, Zihan, et al.
Published: (2024)
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning
by: Jain, Kushal, et al.
Published: (2023)
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
by: Huang, Kaixuan, et al.
Published: (2025)
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs
by: Wang, Lei, et al.
Published: (2024)
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
by: Liu, Xiaoran, et al.
Published: (2025)
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
by: Huan, Maggie, et al.
Published: (2025)
STAR-PólyaMath: Multi-Agent Reasoning under Persistent Meta-Strategic Supervision
by: Wu, Jiaao, et al.
Published: (2026)
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning
by: Chen, Zui, et al.
Published: (2024)
Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs
by: Yang, Wanli, et al.
Published: (2026)
TabularMath: Understanding Math Reasoning over Tables with Large Language Models
by: Tian, Shi-Yu, et al.
Published: (2025)
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts
by: Wang, Yiming, et al.
Published: (2025)
Poivre: Self-Refining Visual Pointing with Reinforcement Learning
by: Yang, Wenjie, et al.
Published: (2025)
LLMs as Assessors: Right for the Right Reason?
by: Saha, Sourav, et al.
Published: (2026)
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
by: Sun, Jiaxing, et al.
Published: (2024)
FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains
by: Zhao, Yilun, et al.
Published: (2023)
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
by: Song, Yuerong, et al.
Published: (2025)
Reasoning Isn't Enough: Examining Truth-Bias and Sycophancy in LLMs
by: Barkett, Emilio, et al.
Published: (2025)
MathOPEval: A Fine-grained Evaluation Benchmark for Visual Operations of MLLMs in Mathematical Reasoning
by: Li, Xiaoyuan, et al.
Published: (2025)
When Is Thinking Enough? Early Exit via Sufficiency Assessment for Efficient Reasoning
by: Xiang, Yang, et al.
Published: (2026)
Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
by: Jiang, Yuxuan, et al.
Published: (2024)
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems
by: Miner, Stephen, et al.
Published: (2024)
SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs
by: Deng, Boyi, et al.
Published: (2025)
KnowCoder-A1: Incentivizing Agentic Reasoning Capability with Outcome Supervision for KBQA
by: Chen, Zhuo, et al.
Published: (2025)
Fitting Is Not Enough: Smoothness in Extremely Quantized LLMs
by: Xu, Yuzhuang, et al.
Published: (2026)
From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision
by: Lin, Qingwen, et al.
Published: (2024)