| Main Authors: | Metel, Michael R., Cui, Yufei, Chen, Boxing, Parthasarathi, Prasanna |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.09855 |
Similar Items
Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
by: Heuillet, Maxime, et al.
Published: (2025)
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
by: Huang, Jerry, et al.
Published: (2024)
GRPO-$λ$: Credit Assignment improves LLM Reasoning
by: Parthasarathi, Prasanna, et al.
Published: (2025)
Do Large Language Models Know How Much They Know?
by: Prato, Gabriele, et al.
Published: (2025)
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
by: Chen, Hongwei, et al.
Published: (2025)
Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
by: Metel, Michael R., et al.
Published: (2024)
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
by: Yang, Wenkai, et al.
Published: (2025)
Does Thinking More always Help? Mirage of Test-Time Scaling in Reasoning Models
by: Ghosal, Soumya Suvra, et al.
Published: (2025)
Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)
Towards Practical Tool Usage for Continually Learning LLMs
by: Huang, Jerry, et al.
Published: (2024)
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2025)
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
by: Wang, Xinyu, et al.
Published: (2025)
LongReasonArena: A Long Reasoning Benchmark for Large Language Models
by: Ding, Jiayu, et al.
Published: (2025)
Scaling over Scaling: Exploring Test-Time Scaling Plateau in Large Reasoning Models
by: Wang, Jian, et al.
Published: (2025)
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
by: Zou, Jiaru, et al.
Published: (2025)
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?
by: Zhang, Qiyuan, et al.
Published: (2025)
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
by: Yan, Yuchen, et al.
Published: (2025)
Modeling Hierarchical Thinking in Large Reasoning Models
by: Shahariar, G M, et al.
Published: (2025)
m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models
by: Huang, Xiaoke, et al.
Published: (2025)
Logical Reasoning with Outcome Reward Models for Test-Time Scaling
by: Thatikonda, Ramya Keerthy, et al.
Published: (2025)
Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time
by: Yang, Wang, et al.
Published: (2025)
To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks
by: Gong, Nanxu, et al.
Published: (2026)
Precedent-Informed Reasoning: Mitigating Overthinking in Large Reasoning Models via Test-Time Precedent Learning
by: Wang, Qianyue, et al.
Published: (2026)
Dynamic Thinking-Token Selection for Efficient Reasoning in Large Reasoning Models
by: Guo, Zhenyuan, et al.
Published: (2026)
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation
by: Tang, Jiakai, et al.
Published: (2025)
Parallel Test-Time Scaling for Latent Reasoning Models
by: You, Runyang, et al.
Published: (2025)
Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
by: Ye, Wengao, et al.
Published: (2025)
Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models
by: Kour, George, et al.
Published: (2025)
Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning
by: Li, Chengzu, et al.
Published: (2026)
Lissard: Long and Simple Sequential Reasoning Datasets
by: Bueno, Mirelle, et al.
Published: (2024)
PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)
To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
by: Zhu, Zihao, et al.
Published: (2025)
Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)
Controlling Thinking Speed in Reasoning Models
by: Lin, Zhengkai, et al.
Published: (2025)
Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation
by: Chen, Peter Baile, et al.
Published: (2025)
ReProbe: Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models
by: Ni, Jingwei, et al.
Published: (2025)
Think$^{2}$: Grounded Metacognitive Reasoning in Large Language Models
by: Elenjical, Abraham Paul, et al.
Published: (2026)
Exploring the System 1 Thinking Capability of Large Reasoning Models
by: Zhang, Wenyuan, et al.
Published: (2025)
Bridging the Reasoning Gap in Vietnamese with Small Language Models via Test-Time Scaling
by: Trung, Bui The, et al.
Published: (2026)
Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model
by: Ding, Bowen, et al.
Published: (2025)