| Main Authors: | Zheng, Xiang, Zhai, Weiqi, Wang, Wei, Yang, Boyu, Li, Wenbo, Luo, Ruixiang, Sun, Haoxiang, Wang, Yucheng, Li, Zhengze, Wang, Meng, Du, Yuetian, Lin, Guojie, Wang, Yaxuan, Xu, Xiaoxiao, Mo, Yanhu, Ren, Xuan, Wei, Hu, Zhao, Bing |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.00564 |
Similar Items
MedPRMBench: A Fine-grained Benchmark for Process Reward Models in Medical Reasoning
by: Wu, Lingyan, et al.
Published: (2026)
DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
by: Sun, Haoxiang, et al.
Published: (2026)
SKYLENAGE Technical Report: Mathematical Reasoning and Contest-Innovation Benchmarks for Multi-Level Math Evaluation
by: Wei, Hu, et al.
Published: (2025)
SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation
by: Xu, Huimin, et al.
Published: (2025)
From Mathematical Reasoning to Code: Generalization of Process Reward Models in Test-Time Scaling
by: Chen, Zhengyu, et al.
Published: (2025)
Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations
by: Li, Chunyang, et al.
Published: (2025)
PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering
by: Wang, Xiangfeng, et al.
Published: (2026)
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
by: Luo, Ruilin, et al.
Published: (2025)
Discovering Process-Outcome Credit in Multi-Step LLM Reasoning
by: Wang, Xiangwei, et al.
Published: (2026)
Hydra-Nav: Object Navigation via Adaptive Dual-Process Reasoning
by: Wang, Zixuan, et al.
Published: (2026)
ProcessBench: Identifying Process Errors in Mathematical Reasoning
by: Zheng, Chujie, et al.
Published: (2024)
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning
by: Zhu, Jiachen, et al.
Published: (2025)
Are Smarter LLMs Safer? Exploring Safety-Reasoning Trade-offs in Prompting and Fine-Tuning
by: Li, Ang, et al.
Published: (2025)
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
by: Wang, Shuai, et al.
Published: (2025)
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
by: Hao, Yuren, et al.
Published: (2025)
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot
by: Jiang, Ruixiang, et al.
Published: (2025)
Process-of-Thought Reasoning for Videos
by: Zhang, Jusheng, et al.
Published: (2026)
Olapa-MCoT: Enhancing the Chinese Mathematical Reasoning Capability of LLMs
by: Zhu, Shaojie, et al.
Published: (2023)
Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners
by: Peng, Miao, et al.
Published: (2025)
Graph Contrastive Invariant Learning from the Causal Perspective
by: Mo, Yanhu, et al.
Published: (2024)
Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)
FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning
by: Wang, Zeyu, et al.
Published: (2026)
Examining False Positives under Inference Scaling for Mathematical Reasoning
by: Wang, Yu, et al.
Published: (2025)
Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps
by: Cui, Yu, et al.
Published: (2025)
Linking Perception, Confidence and Accuracy in MLLMs
by: Du, Yuetian, et al.
Published: (2026)
BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs
by: Wang, Ben, et al.
Published: (2026)
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
by: Tang, Zhengyang, et al.
Published: (2024)
STAR: Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction
by: Wang, Xiaoxiao, et al.
Published: (2026)
DRAGON: Guard LLM Unlearning in Context via Negative Detection and Reasoning
by: Wang, Yaxuan, et al.
Published: (2025)
CRISP: Compressed Reasoning via Iterative Self-Policy Distillation
by: Sang, Hejian, et al.
Published: (2026)
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset
by: Wang, Weiqi, et al.
Published: (2024)
Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis
by: Jiao, Zhengbo, et al.
Published: (2026)
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
by: Wang, Dingzirui, et al.
Published: (2024)
Improve Mathematical Reasoning in Language Models by Automated Process Supervision
by: Luo, Liangchen, et al.
Published: (2024)
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning
by: Chegini, Atoosa, et al.
Published: (2026)
ReasVQA: Advancing VideoQA with Imperfect Reasoning Process
by: Liang, Jianxin, et al.
Published: (2025)
Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
by: Yu, Zhuohao, et al.
Published: (2024)
Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization
by: Zhai, Zhiyuan, et al.
Published: (2026)
Making Mathematical Reasoning Adaptive
by: Lai, Zhejian, et al.
Published: (2025)
Query-focused and Memory-aware Reranker for Long Context Processing
by: Li, Yuqing, et al.
Published: (2026)