Library Catalog

Bibliographic Details
Main Authors:	Zheng, Yilun; Ma, Dongyang; Liang, Tian; Xu, Jiahao; Huang, Xinting; Chen, Lihui; Mi, Haitao; Wang, Yan
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2602.08030

Similar Items

The End of Manual Decoding: Towards Truly End-to-End Language Models
by: Wang, Zhichao, et al.
Published: (2025)

DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains
by: Liang, Tian, et al.
Published: (2025)

Less is More: Denoising Knowledge Graphs For Retrieval Augmented Generation
by: Zheng, Yilun, et al.
Published: (2025)

WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
by: Li, Chunyang, et al.
Published: (2025)

DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
by: Zhang, Ziyin, et al.
Published: (2025)

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
by: Liu, Xiaoyuan, et al.
Published: (2025)

Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
by: Huang, Jianheng, et al.
Published: (2024)

Graph-O1 : Monte Carlo Tree Search with Reinforcement Learning for Text-Attributed Graph Reasoning
by: Liu, Lihui
Published: (2025)

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning
by: Panaganti, Kishan, et al.
Published: (2026)

The Pensieve Paradigm: Stateful Language Models Mastering Their Own Context
by: Liu, Xiaoyuan, et al.
Published: (2026)

Block-Attention for Efficient Prefilling
by: Ma, Dongyang, et al.
Published: (2024)

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
by: He, Zhiwei, et al.
Published: (2025)

Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning
by: Huang, Xinting, et al.
Published: (2025)

EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving
by: Li, Mukai, et al.
Published: (2025)

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)

Self-Consistency Boosts Calibration for Math Reasoning
by: Wang, Ante, et al.
Published: (2024)

Offline Learning and Forgetting for Reasoning with Large Language Models
by: Ni, Tianwei, et al.
Published: (2025)

Fine-Grained Self-Endorsement Improves Factuality and Reasoning
by: Wang, Ante, et al.
Published: (2024)

Dual-Uncertainty Guided Policy Learning for Multimodal Reasoning
by: Liu, Rui, et al.
Published: (2025)

Routing-Free Mixture-of-Experts
by: Liu, Yilun, et al.
Published: (2026)

FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis
by: Zheng, Yilun, et al.
Published: (2025)

LLM Unlearning via Loss Adjustment with Only Forget Data
by: Wang, Yaxuan, et al.
Published: (2024)

Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding
by: Liu, Tianqiao, et al.
Published: (2024)

Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration
by: Zhang, Qifan, et al.
Published: (2026)

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows
by: Yao, Wenlin, et al.
Published: (2024)

PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection
by: Cheng, Siyuan, et al.
Published: (2026)

Reasoning with Sampling: Your Base Model is Smarter Than You Think
by: Karan, Aayush, et al.
Published: (2025)

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
by: Wang, Mengru, et al.
Published: (2025)

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away
by: Ghosal, Soumya Suvra, et al.
Published: (2026)

Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models
by: Pirozelli, Paulo, et al.
Published: (2023)

Revisiting Catastrophic Forgetting in Large Language Model Tuning
by: Li, Hongyu, et al.
Published: (2024)

Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations
by: Luo, Haozheng, et al.
Published: (2026)

SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning
by: Li, Xuchen, et al.
Published: (2025)

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
by: Diao, Muxi, et al.
Published: (2026)

Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
by: Chen, Zui, et al.
Published: (2025)

Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning
by: Xu, Jillian, et al.
Published: (2025)

Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
by: Zhang, Ziyin, et al.
Published: (2024)

ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments
by: Wang, Yuquan, et al.
Published: (2025)

R-Zero: Self-Evolving Reasoning LLM from Zero Data
by: Huang, Chengsong, et al.
Published: (2025)

SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing
by: Liu, Hongjun, et al.
Published: (2025)