:: Library Catalog

Bibliographic Details
Main Authors:	Lu, Junjie; Liu, Yuliang; Qu, Chaofeng; Shen, Wei; Lin, Zhouhan; Zhang, Chuheng; Xu, Min
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.11104

Similar Items

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
by: Liu, Yuliang, et al.
Published: (2025)

Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools
by: Wu, Junde, et al.
Published: (2025)

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
by: Chia, Yew Ken, et al.
Published: (2024)

Iterative Reasoning Preference Optimization
by: Pang, Richard Yuanzhe, et al.
Published: (2024)

Theory-Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning
by: Richardson, Andrew Keenan, et al.
Published: (2025)

Advancing LLM Reasoning Generalists with Preference Trees
by: Yuan, Lifan, et al.
Published: (2024)

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
by: Shen, Maohao, et al.
Published: (2025)

PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization
by: Cao, Zouying, et al.
Published: (2025)

Probability-Consistent Preference Optimization for Enhanced LLM Reasoning
by: Yang, Yunqiao, et al.
Published: (2025)

Optimizing Anytime Reasoning via Budget Relative Policy Optimization
by: Qi, Penghui, et al.
Published: (2025)

Efficient Reasoning Through Suppression of Self-Affirmation Reflections in Large Reasoning Models
by: Liu, Kaiyuan, et al.
Published: (2025)

The Evolution of Thought: Tracking LLM Overthinking via Reasoning Dynamics Analysis
by: Wei, Zihao, et al.
Published: (2025)

CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
by: Pourreza, Mohammadreza, et al.
Published: (2024)

Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling
by: Wan, Guangya, et al.
Published: (2024)

Reliable Reasoning Path: Distilling Effective Guidance for LLM Reasoning with Knowledge Graphs
by: Xiao, Yilin, et al.
Published: (2025)

Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Thought
by: Tan, Xiaoyu, et al.
Published: (2024)

RAPO: Risk-Aware Preference Optimization for Generalizable Safe Reasoning
by: Wei, Zeming, et al.
Published: (2026)

AgenticMath: Enhancing LLM Reasoning via Agentic-based Math Data Generation
by: Liu, Xianyang, et al.
Published: (2025)

Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
by: Kawakami, Wataru, et al.
Published: (2025)

Reasoning Like a Doctor: Improving Medical Dialogue Systems via Diagnostic Reasoning Process Alignment
by: Xu, Kaishuai, et al.
Published: (2024)

Step-level Value Preference Optimization for Mathematical Reasoning
by: Chen, Guoxin, et al.
Published: (2024)

Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
by: Guan, Weixin, et al.
Published: (2026)

FlowRL: Matching Reward Distributions for LLM Reasoning
by: Zhu, Xuekai, et al.
Published: (2025)

PathCoT: Chain-of-Thought Prompting for Zero-shot Pathology Visual Reasoning
by: Zhou, Junjie, et al.
Published: (2025)

Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
by: Singh, Joykirat, et al.
Published: (2025)

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI
by: Wang, Yuxia, et al.
Published: (2025)

Enhancing LLM Reasoning with Reward-guided Tree Search
by: Jiang, Jinhao, et al.
Published: (2024)

CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
by: Shen, Junzhe, et al.
Published: (2026)

Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective
by: Liu, Junnan, et al.
Published: (2025)

Reason-Align-Respond: Aligning LLM Reasoning with Knowledge Graphs for KGQA
by: Shen, Xiangqing, et al.
Published: (2025)

Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
by: Yan, Shaotian, et al.
Published: (2025)

Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
by: Liu, Qihao, et al.
Published: (2025)

Dissecting Human and LLM Preferences
by: Li, Junlong, et al.
Published: (2024)

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
by: Zhang, Kongcheng, et al.
Published: (2025)

Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
by: Tang, Haochun, et al.
Published: (2026)

AdapThink: Adaptive Thinking Preferences for Reasoning Language Model
by: Wan, Xu, et al.
Published: (2025)

Subtle Errors in Reasoning: Preference Learning via Error-injected Self-editing
by: Xu, Kaishuai, et al.
Published: (2024)

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
by: Hassid, Michael, et al.
Published: (2025)

How Much Can RAG Help the Reasoning of LLM?
by: Liu, Jingyu, et al.
Published: (2024)

MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
by: Yan, Yuchen, et al.
Published: (2025)