:: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Xiang; Zhai, Weiqi; Wang, Wei; Yang, Boyu; Li, Wenbo; Luo, Ruixiang; Sun, Haoxiang; Wang, Yucheng; Li, Zhengze; Wang, Meng; Du, Yuetian; Lin, Guojie; Wang, Yaxuan; Xu, Xiaoxiao; Mo, Yanhu; Ren, Xuan; Wei, Hu; Zhao, Bing
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2602.00564

Similar Items

MedPRMBench: A Fine-grained Benchmark for Process Reward Models in Medical Reasoning
by: Wu, Lingyan, et al.
Published: (2026)

DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
by: Sun, Haoxiang, et al.
Published: (2026)

SKYLENAGE Technical Report: Mathematical Reasoning and Contest-Innovation Benchmarks for Multi-Level Math Evaluation
by: Wei, Hu, et al.
Published: (2025)

SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation
by: Xu, Huimin, et al.
Published: (2025)

From Mathematical Reasoning to Code: Generalization of Process Reward Models in Test-Time Scaling
by: Chen, Zhengyu, et al.
Published: (2025)

Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations
by: Li, Chunyang, et al.
Published: (2025)

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering
by: Wang, Xiangfeng, et al.
Published: (2026)

Unlocking Multimodal Mathematical Reasoning via Process Reward Model
by: Luo, Ruilin, et al.
Published: (2025)

Discovering Process-Outcome Credit in Multi-Step LLM Reasoning
by: Wang, Xiangwei, et al.
Published: (2026)

Hydra-Nav: Object Navigation via Adaptive Dual-Process Reasoning
by: Wang, Zixuan, et al.
Published: (2026)

ProcessBench: Identifying Process Errors in Mathematical Reasoning
by: Zheng, Chujie, et al.
Published: (2024)

Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning
by: Zhu, Jiachen, et al.
Published: (2025)

Are Smarter LLMs Safer? Exploring Safety-Reasoning Trade-offs in Prompting and Fine-Tuning
by: Li, Ang, et al.
Published: (2025)

Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
by: Wang, Shuai, et al.
Published: (2025)

An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
by: Hao, Yuren, et al.
Published: (2025)

Multimodal LLMs Can Reason about Aesthetics in Zero-Shot
by: Jiang, Ruixiang, et al.
Published: (2025)

Process-of-Thought Reasoning for Videos
by: Zhang, Jusheng, et al.
Published: (2026)

Olapa-MCoT: Enhancing the Chinese Mathematical Reasoning Capability of LLMs
by: Zhu, Shaojie, et al.
Published: (2023)

Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners
by: Peng, Miao, et al.
Published: (2025)

Graph Contrastive Invariant Learning from the Causal Perspective
by: Mo, Yanhu, et al.
Published: (2024)

Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)

FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning
by: Wang, Zeyu, et al.
Published: (2026)

Examining False Positives under Inference Scaling for Mathematical Reasoning
by: Wang, Yu, et al.
Published: (2025)

Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps
by: Cui, Yu, et al.
Published: (2025)

Linking Perception, Confidence and Accuracy in MLLMs
by: Du, Yuetian, et al.
Published: (2026)

BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs
by: Wang, Ben, et al.
Published: (2026)

MathScale: Scaling Instruction Tuning for Mathematical Reasoning
by: Tang, Zhengyang, et al.
Published: (2024)

STAR: Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction
by: Wang, Xiaoxiao, et al.
Published: (2026)

DRAGON: Guard LLM Unlearning in Context via Negative Detection and Reasoning
by: Wang, Yaxuan, et al.
Published: (2025)

CRISP: Compressed Reasoning via Iterative Self-Policy Distillation
by: Sang, Hejian, et al.
Published: (2026)

MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset
by: Wang, Weiqi, et al.
Published: (2024)

Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis
by: Jiao, Zhengbo, et al.
Published: (2026)

Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
by: Wang, Dingzirui, et al.
Published: (2024)

Improve Mathematical Reasoning in Language Models by Automated Process Supervision
by: Luo, Liangchen, et al.
Published: (2024)

Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning
by: Chegini, Atoosa, et al.
Published: (2026)

ReasVQA: Advancing VideoQA with Imperfect Reasoning Process
by: Liang, Jianxin, et al.
Published: (2025)

Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
by: Yu, Zhuohao, et al.
Published: (2024)

Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization
by: Zhai, Zhiyuan, et al.
Published: (2026)

Making Mathematical Reasoning Adaptive
by: Lai, Zhejian, et al.
Published: (2025)

Query-focused and Memory-aware Reranker for Long Context Processing
by: Li, Yuqing, et al.
Published: (2026)