Library Catalog

Bibliographic Details
Main Authors:	He, Haoyang; Rong, Zihua; Zhao, Liangjie; Zhao, Yunjia; Yang, Lan; Zhang, Honggang
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2603.03297

Similar Items

Self-Trained Verification for Training- and Test-Time Self-Improvement
by: Wu, Chen Henry, et al.
Published: (2026)

Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
by: Bazaga, Adrián, et al.
Published: (2025)

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
by: Subramaniam, Vighnesh, et al.
Published: (2025)

Stepwise Self-Consistent Mathematical Reasoning with Large Language Models
by: Zhao, Zilong, et al.
Published: (2024)

Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet
by: Zhao, James Xu, et al.
Published: (2025)

AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air
by: Yang, Shiyi, et al.
Published: (2025)

CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test
by: Hu, Zhangyi, et al.
Published: (2026)

Method-Based Reasoning for Large Language Models: Extraction, Reuse, and Continuous Improvement
by: Su, Hong
Published: (2025)

Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything
by: Lin, Huawei, et al.
Published: (2025)

Self-Improvement as Coherence Optimization: A Theoretical Account
by: Qiu, Tianyi, et al.
Published: (2026)

Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models
by: Yang, Zihua, et al.
Published: (2026)

Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models
by: Wang, Ziyan, et al.
Published: (2025)

Crosslingual Reasoning through Test-Time Scaling
by: Yong, Zheng-Xin, et al.
Published: (2025)

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision
by: Xi, Zhiheng, et al.
Published: (2024)

Adaptive Test-Time Reasoning via Reward-Guided Dual-Phase Search
by: Cui, Yingqian, et al.
Published: (2025)

Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning
by: Yu, Yongcan, et al.
Published: (2026)

SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning
by: He, Zelin, et al.
Published: (2026)

Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs
by: Wang, Qibin, et al.
Published: (2025)

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
by: Zeng, Weihao, et al.
Published: (2024)

Reinforcement Learning Teachers of Test Time Scaling
by: Cetin, Edoardo, et al.
Published: (2025)

TTCS: Test-Time Curriculum Synthesis for Self-Evolving
by: Yang, Chengyi, et al.
Published: (2026)

Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time
by: Zhang, Zhenyu, et al.
Published: (2025)

SABER: Switchable and Balanced Training for Efficient LLM Reasoning
by: Zhao, Kai, et al.
Published: (2025)

A Survey of Test-Time Compute: From Intuitive Inference to Deliberate Reasoning
by: Ji, Yixin, et al.
Published: (2025)

Learning to Reason from Feedback at Test-Time
by: Li, Yanyang, et al.
Published: (2025)

MaskTab: Scalable Masked Tabular Pretraining with Scaling Laws and Distillation for Industrial Classification
by: Zheng, Bo, et al.
Published: (2026)

Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression
by: Huang, Jiameng, et al.
Published: (2025)

Vision-Language Models Can Self-Improve Reasoning via Reflection
by: Cheng, Kanzhi, et al.
Published: (2024)

Self-Improvement in Language Models: The Sharpening Mechanism
by: Huang, Audrey, et al.
Published: (2024)

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
by: Shen, Guobin, et al.
Published: (2026)

Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
by: Han, Yixuan, et al.
Published: (2025)

Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
by: Ding, Yiwen, et al.
Published: (2024)

Self-Improving LLM Agents at Test-Time
by: Acikgoz, Emre Can, et al.
Published: (2025)

Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation
by: Chen, Peter Baile, et al.
Published: (2025)

Parallel Test-Time Scaling for Latent Reasoning Models
by: You, Runyang, et al.
Published: (2025)

Reasoning: From Reflection to Solution
by: Li, Zixi
Published: (2025)

TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
by: Zou, Jiaru, et al.
Published: (2025)

In-Place Test-Time Training
by: Feng, Guhao, et al.
Published: (2026)

Absolute Zero: Reinforced Self-play Reasoning with Zero Data
by: Zhao, Andrew, et al.
Published: (2025)

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
by: Xu, Fangzhi, et al.
Published: (2025)