| Main Authors: | He, Haoyang, Rong, Zihua, Zhao, Liangjie, Zhao, Yunjia, Yang, Lan, Zhang, Honggang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.03297 |
Similar Items
Self-Trained Verification for Training- and Test-Time Self-Improvement
by: Wu, Chen Henry, et al.
Published: (2026)
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
by: Bazaga, Adrián, et al.
Published: (2025)
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
by: Subramaniam, Vighnesh, et al.
Published: (2025)
Stepwise Self-Consistent Mathematical Reasoning with Large Language Models
by: Zhao, Zilong, et al.
Published: (2024)
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet
by: Zhao, James Xu, et al.
Published: (2025)
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air
by: Yang, Shiyi, et al.
Published: (2025)
CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test
by: Hu, Zhangyi, et al.
Published: (2026)
Method-Based Reasoning for Large Language Models: Extraction, Reuse, and Continuous Improvement
by: Su, Hong
Published: (2025)
Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything
by: Lin, Huawei, et al.
Published: (2025)
Self-Improvement as Coherence Optimization: A Theoretical Account
by: Qiu, Tianyi, et al.
Published: (2026)
Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models
by: Yang, Zihua, et al.
Published: (2026)
Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models
by: Wang, Ziyan, et al.
Published: (2025)
Crosslingual Reasoning through Test-Time Scaling
by: Yong, Zheng-Xin, et al.
Published: (2025)
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision
by: Xi, Zhiheng, et al.
Published: (2024)
Adaptive Test-Time Reasoning via Reward-Guided Dual-Phase Search
by: Cui, Yingqian, et al.
Published: (2025)
Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning
by: Yu, Yongcan, et al.
Published: (2026)
SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning
by: He, Zelin, et al.
Published: (2026)
Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs
by: Wang, Qibin, et al.
Published: (2025)
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
by: Zeng, Weihao, et al.
Published: (2024)
Reinforcement Learning Teachers of Test Time Scaling
by: Cetin, Edoardo, et al.
Published: (2025)
TTCS: Test-Time Curriculum Synthesis for Self-Evolving
by: Yang, Chengyi, et al.
Published: (2026)
Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time
by: Zhang, Zhenyu, et al.
Published: (2025)
SABER: Switchable and Balanced Training for Efficient LLM Reasoning
by: Zhao, Kai, et al.
Published: (2025)
A Survey of Test-Time Compute: From Intuitive Inference to Deliberate Reasoning
by: Ji, Yixin, et al.
Published: (2025)
Learning to Reason from Feedback at Test-Time
by: Li, Yanyang, et al.
Published: (2025)
MaskTab: Scalable Masked Tabular Pretraining with Scaling Laws and Distillation for Industrial Classification
by: Zheng, Bo, et al.
Published: (2026)
Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression
by: Huang, Jiameng, et al.
Published: (2025)
Vision-Language Models Can Self-Improve Reasoning via Reflection
by: Cheng, Kanzhi, et al.
Published: (2024)
Self-Improvement in Language Models: The Sharpening Mechanism
by: Huang, Audrey, et al.
Published: (2024)
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
by: Shen, Guobin, et al.
Published: (2026)
Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
by: Han, Yixuan, et al.
Published: (2025)
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
by: Ding, Yiwen, et al.
Published: (2024)
Self-Improving LLM Agents at Test-Time
by: Acikgoz, Emre Can, et al.
Published: (2025)
Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation
by: Chen, Peter Baile, et al.
Published: (2025)
Parallel Test-Time Scaling for Latent Reasoning Models
by: You, Runyang, et al.
Published: (2025)
Reasoning: From Reflection to Solution
by: Li, Zixi
Published: (2025)
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
by: Zou, Jiaru, et al.
Published: (2025)
In-Place Test-Time Training
by: Feng, Guhao, et al.
Published: (2026)
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
by: Zhao, Andrew, et al.
Published: (2025)
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
by: Xu, Fangzhi, et al.
Published: (2025)