:: Library Catalog

Bibliographic Details
Main Authors:	Sun, Zhishen; Dai, Guang; Tsang, Ivor; Ye, Haishan
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.08022

Similar Items

MSCR: Exploring the Vulnerability of LLMs' Mathematical Reasoning Abilities Using Multi-Source Candidate Replacement
by: Sun, Zhishen, et al.
Published: (2025)

ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning
by: Sun, Zhishen, et al.
Published: (2026)

FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
by: Dang, Sizhe, et al.
Published: (2025)

Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
by: Xue, Jing, et al.
Published: (2025)

Can Large Reasoning Models Improve Accuracy on Mathematical Tasks Using Flawed Thinking?
by: Amjith, Saraswathy, et al.
Published: (2025)

HC$^2$L: Hybrid and Cooperative Contrastive Learning for Cross-lingual Spoken Language Understanding
by: Xing, Bowen, et al.
Published: (2024)

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning
by: Zhao, Jun, et al.
Published: (2024)

Large Language Models and Mathematical Reasoning Failures
by: Boye, Johan, et al.
Published: (2025)

From $O(mn)$ to $O(r^2)$: Two-Sided Low-Rank Communication for Adam in Distributed Training with Memory Efficiency
by: Dang, Sizhe, et al.
Published: (2026)

Mathematical Computation and Reasoning Errors by Large Language Models
by: Zhang, Liang, et al.
Published: (2025)

Towards Backdoor-Based Ownership Verification for Vision-Language-Action Models
by: Sun, Ming, et al.
Published: (2026)

I-RAVEN-X: Benchmarking Generalization and Robustness of Analogical and Mathematical Reasoning in Large Language and Reasoning Models
by: Camposampiero, Giacomo, et al.
Published: (2025)

FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research
by: Recchia, Gabriel, et al.
Published: (2025)

Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models
by: Zhou, Yue, et al.
Published: (2024)

A Survey on Mathematical Reasoning and Optimization with Large Language Models
by: Forootani, Ali
Published: (2025)

GRPO and Reflection Reward for Mathematical Reasoning in Large Language Models
by: Wang, Zhijie
Published: (2026)

Reasoning Relay: Evaluating Stability and Interchangeability of Large Language Models in Mathematical Reasoning
by: Lu, Leo, et al.
Published: (2025)

Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges
by: Shrestha, Safal, et al.
Published: (2025)

A Survey on Large Language Models for Mathematical Reasoning
by: Wang, Peng-Yuan, et al.
Published: (2025)

Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs
by: Hua, Andong, et al.
Published: (2025)

Time-Annealed Perturbation Sampling: Diverse Generation for Diffusion Language Models
by: Wu, Jingxuan, et al.
Published: (2026)

Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field
by: Tan, Kim Yong, et al.
Published: (2026)

Sharpness-Aware Black-Box Optimization
by: Ye, Feiyang, et al.
Published: (2024)

Benchmarking Reasoning Robustness in Large Language Models
by: Yu, Tong, et al.
Published: (2025)

Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
by: Di, Hao, et al.
Published: (2024)

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
by: Zhao, Yanjun, et al.
Published: (2024)

Order Matters: Exploring Order Sensitivity in Multimodal Large Language Models
by: Tan, Zhijie, et al.
Published: (2024)

Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction
by: Zeng, Qinglin, et al.
Published: (2025)

M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models
by: Wang, Junjian, et al.
Published: (2026)

Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities
by: Rahman, Roussel
Published: (2025)

Towards Robust Mathematical Reasoning
by: Luong, Thang, et al.
Published: (2025)

CAMA: Enhancing Mathematical Reasoning in Large Language Models with Causal Knowledge
by: Zan, Lei, et al.
Published: (2025)

Systematic Optimization of Open Source Large Language Models for Mathematical Reasoning
by: Pawar, Pranav, et al.
Published: (2025)

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
by: Yu, Zhouliang, et al.
Published: (2025)

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
by: Mirzadeh, Iman, et al.
Published: (2024)

Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model
by: Zhu, Xunyu, et al.
Published: (2024)

MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models
by: Peng, Shuai, et al.
Published: (2024)

Stepwise Self-Consistent Mathematical Reasoning with Large Language Models
by: Zhao, Zilong, et al.
Published: (2024)

Forward-Backward Reasoning in Large Language Models for Mathematical Verification
by: Jiang, Weisen, et al.
Published: (2023)

Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection
by: Liu, MingShan, et al.
Published: (2025)