| Main Author: | Sun, Xinhai |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.08520 |
Similar Items
From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty
by: Jenane, Azza, et al.
Published: (2026)
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
by: Schmitt, Marvin, et al.
Published: (2023)
Tracing Uncertainty in Language Model "Reasoning"
by: Grünefeld, Nils, et al.
Published: (2026)
Reinforcement Learning for Sequence Design Leveraging Protein Language Models
by: Subramanian, Jithendaraa, et al.
Published: (2024)
Reasoning Language Model Inference Serving Unveiled: An Empirical Study
by: Li, Qi, et al.
Published: (2025)
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
by: Sakarvadia, Mansi, et al.
Published: (2023)
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
by: Sui, Yuan, et al.
Published: (2025)
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
by: Zhao, Yilong, et al.
Published: (2025)
PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
by: Wang, Xinhai, et al.
Published: (2025)
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
by: Tu, Songjun, et al.
Published: (2024)
Reinforcing Chain-of-Thought Reasoning with Self-Evolving Rubrics
by: Sheng, Leheng, et al.
Published: (2026)
Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning
by: He, Qianxi, et al.
Published: (2025)
Synthetic Error Injection Fails to Elicit Self-Correction In Language Models
by: Wu, David X., et al.
Published: (2025)
Optimal Self-Consistency for Efficient Reasoning with Large Language Models
by: Feng, Austin, et al.
Published: (2025)
Quantifying and Understanding Uncertainty in Large Reasoning Models
by: Li, Yangyi, et al.
Published: (2026)
When to ASK: Uncertainty-Gated Language Assistance for Reinforcement Learning
by: Monteiro, Juarez, et al.
Published: (2026)
Self-Correction Bench: Uncovering and Addressing the Self-Correction Blind Spot in Large Language Models
by: Tsui, Ken
Published: (2025)
Adaptive Negative Reinforcement for LLM Reasoning: Dynamically Balancing Correction and Diversity in RLVR
by: Ingle, Yash, et al.
Published: (2026)
GNNRL-Smoothing: A Prior-Free Reinforcement Learning Model for Mesh Smoothing
by: Wang, Zhichao, et al.
Published: (2024)
FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference
by: Bajpai, Divya Jyoti, et al.
Published: (2025)
PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases
by: Vuddanti, Sri Vatsa, et al.
Published: (2025)
Reasoning on the Manifold: Bidirectional Consistency for Self-Verification in Diffusion Language Models
by: Ruan, Jiaoyang, et al.
Published: (2026)
The Geometry of Self-Verification in a Task-Specific Reasoning Model
by: Lee, Andrew, et al.
Published: (2025)
Can Large Reasoning Models do Analogical Reasoning under Perceptual Uncertainty?
by: Camposampiero, Giacomo, et al.
Published: (2025)
ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees
by: Wang, Zhiyuan, et al.
Published: (2024)
Active Preference Inference using Language Models and Probabilistic Reasoning
by: Piriyakulkij, Wasu Top, et al.
Published: (2023)
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
by: Cui, Ganqu, et al.
Published: (2025)
Robust Uncertainty Quantification for Self-Evolving Large Language Models via Continual Domain Pretraining
by: Zhou, Xiaofan, et al.
Published: (2025)
Mitigating Overthinking in Large Reasoning Models via Difficulty-aware Reinforcement Learning
by: Wan, Qian, et al.
Published: (2026)
GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models
by: Tang, Xiaohang, et al.
Published: (2026)
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
by: Wang, Haozhe, et al.
Published: (2025)
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning
by: Guo, Siyuan, et al.
Published: (2023)
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
by: Kim, Kwanyoung, et al.
Published: (2025)
MeshONet: A Generalizable and Efficient Operator Learning Method for Structured Mesh Generation
by: Xiao, Jing, et al.
Published: (2025)
Stabilizing Reinforcement Learning for Diffusion Language Models
by: Zhong, Jianyuan, et al.
Published: (2026)
Self-Hinting Language Models Enhance Reinforcement Learning
by: Liao, Baohao, et al.
Published: (2026)
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
by: Zhang, Tunyu, et al.
Published: (2025)
Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference
by: Kim, Geonhee, et al.
Published: (2024)
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
by: Xi, Zhiheng, et al.
Published: (2024)
Knowledge Graph Reasoning with Self-supervised Reinforcement Learning
by: Ma, Ying, et al.
Published: (2024)