| Main Authors: | Lu, Junjie, Liu, Yuliang, Qu, Chaofeng, Shen, Wei, Lin, Zhouhan, Zhang, Chuheng, Xu, Min |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.11104 |
Similar Items
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
by: Liu, Yuliang, et al.
Published: (2025)
Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools
by: Wu, Junde, et al.
Published: (2025)
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
by: Chia, Yew Ken, et al.
Published: (2024)
Iterative Reasoning Preference Optimization
by: Pang, Richard Yuanzhe, et al.
Published: (2024)
Theory-Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning
by: Richardson, Andrew Keenan, et al.
Published: (2025)
Advancing LLM Reasoning Generalists with Preference Trees
by: Yuan, Lifan, et al.
Published: (2024)
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
by: Shen, Maohao, et al.
Published: (2025)
PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization
by: Cao, Zouying, et al.
Published: (2025)
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning
by: Yang, Yunqiao, et al.
Published: (2025)
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
by: Qi, Penghui, et al.
Published: (2025)
Efficient Reasoning Through Suppression of Self-Affirmation Reflections in Large Reasoning Models
by: Liu, Kaiyuan, et al.
Published: (2025)
The Evolution of Thought: Tracking LLM Overthinking via Reasoning Dynamics Analysis
by: Wei, Zihao, et al.
Published: (2025)
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
by: Pourreza, Mohammadreza, et al.
Published: (2024)
Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling
by: Wan, Guangya, et al.
Published: (2024)
Reliable Reasoning Path: Distilling Effective Guidance for LLM Reasoning with Knowledge Graphs
by: Xiao, Yilin, et al.
Published: (2025)
Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Thought
by: Tan, Xiaoyu, et al.
Published: (2024)
RAPO: Risk-Aware Preference Optimization for Generalizable Safe Reasoning
by: Wei, Zeming, et al.
Published: (2026)
AgenticMath: Enhancing LLM Reasoning via Agentic-based Math Data Generation
by: Liu, Xianyang, et al.
Published: (2025)
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
by: Kawakami, Wataru, et al.
Published: (2025)
Reasoning Like a Doctor: Improving Medical Dialogue Systems via Diagnostic Reasoning Process Alignment
by: Xu, Kaishuai, et al.
Published: (2024)
Step-level Value Preference Optimization for Mathematical Reasoning
by: Chen, Guoxin, et al.
Published: (2024)
Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
by: Guan, Weixin, et al.
Published: (2026)
FlowRL: Matching Reward Distributions for LLM Reasoning
by: Zhu, Xuekai, et al.
Published: (2025)
PathCoT: Chain-of-Thought Prompting for Zero-shot Pathology Visual Reasoning
by: Zhou, Junjie, et al.
Published: (2025)
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
by: Singh, Joykirat, et al.
Published: (2025)
Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI
by: Wang, Yuxia, et al.
Published: (2025)
Enhancing LLM Reasoning with Reward-guided Tree Search
by: Jiang, Jinhao, et al.
Published: (2024)
CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
by: Shen, Junzhe, et al.
Published: (2026)
Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective
by: Liu, Junnan, et al.
Published: (2025)
Reason-Align-Respond: Aligning LLM Reasoning with Knowledge Graphs for KGQA
by: Shen, Xiangqing, et al.
Published: (2025)
Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
by: Yan, Shaotian, et al.
Published: (2025)
Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
by: Liu, Qihao, et al.
Published: (2025)
Dissecting Human and LLM Preferences
by: Li, Junlong, et al.
Published: (2024)
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
by: Zhang, Kongcheng, et al.
Published: (2025)
Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
by: Tang, Haochun, et al.
Published: (2026)
AdapThink: Adaptive Thinking Preferences for Reasoning Language Model
by: Wan, Xu, et al.
Published: (2025)
Subtle Errors in Reasoning: Preference Learning via Error-injected Self-editing
by: Xu, Kaishuai, et al.
Published: (2024)
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
by: Hassid, Michael, et al.
Published: (2025)
How Much Can RAG Help the Reasoning of LLM?
by: Liu, Jingyu, et al.
Published: (2024)
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
by: Yan, Yuchen, et al.
Published: (2025)