:: Library Catalog

Bibliographic Details
Main Authors:	Metel, Michael R.; Cui, Yufei; Chen, Boxing; Parthasarathi, Prasanna
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2601.09855

Similar Items

Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
by: Heuillet, Maxime, et al.
Published: (2025)

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
by: Huang, Jerry, et al.
Published: (2024)

GRPO-$λ$: Credit Assignment improves LLM Reasoning
by: Parthasarathi, Prasanna, et al.
Published: (2025)

Do Large Language Models Know How Much They Know?
by: Prato, Gabriele, et al.
Published: (2025)

MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
by: Chen, Hongwei, et al.
Published: (2025)

Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
by: Metel, Michael R., et al.
Published: (2024)

Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
by: Yang, Wenkai, et al.
Published: (2025)

Does Thinking More always Help? Mirage of Test-Time Scaling in Reasoning Models
by: Ghosal, Soumya Suvra, et al.
Published: (2025)

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)

Towards Practical Tool Usage for Continually Learning LLMs
by: Huang, Jerry, et al.
Published: (2024)

Train Long, Think Short: Curriculum Learning for Efficient Reasoning
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2025)

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
by: Wang, Xinyu, et al.
Published: (2025)

LongReasonArena: A Long Reasoning Benchmark for Large Language Models
by: Ding, Jiayu, et al.
Published: (2025)

Scaling over Scaling: Exploring Test-Time Scaling Plateau in Large Reasoning Models
by: Wang, Jian, et al.
Published: (2025)

TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
by: Zou, Jiaru, et al.
Published: (2025)

A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?
by: Zhang, Qiyuan, et al.
Published: (2025)

InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
by: Yan, Yuchen, et al.
Published: (2025)

Modeling Hierarchical Thinking in Large Reasoning Models
by: Shahariar, G M, et al.
Published: (2025)

m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models
by: Huang, Xiaoke, et al.
Published: (2025)

Logical Reasoning with Outcome Reward Models for Test-Time Scaling
by: Thatikonda, Ramya Keerthy, et al.
Published: (2025)

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time
by: Yang, Wang, et al.
Published: (2025)

To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks
by: Gong, Nanxu, et al.
Published: (2026)

Precedent-Informed Reasoning: Mitigating Overthinking in Large Reasoning Models via Test-Time Precedent Learning
by: Wang, Qianyue, et al.
Published: (2026)

Dynamic Thinking-Token Selection for Efficient Reasoning in Large Reasoning Models
by: Guo, Zhenyuan, et al.
Published: (2026)

Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation
by: Tang, Jiakai, et al.
Published: (2025)

Parallel Test-Time Scaling for Latent Reasoning Models
by: You, Runyang, et al.
Published: (2025)

Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
by: Ye, Wengao, et al.
Published: (2025)

Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models
by: Kour, George, et al.
Published: (2025)

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning
by: Li, Chengzu, et al.
Published: (2026)

Lissard: Long and Simple Sequential Reasoning Datasets
by: Bueno, Mirelle, et al.
Published: (2024)

PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)

To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
by: Zhu, Zihao, et al.
Published: (2025)

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)

Controlling Thinking Speed in Reasoning Models
by: Lin, Zhengkai, et al.
Published: (2025)

Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation
by: Chen, Peter Baile, et al.
Published: (2025)

ReProbe: Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models
by: Ni, Jingwei, et al.
Published: (2025)

Think$^{2}$: Grounded Metacognitive Reasoning in Large Language Models
by: Elenjical, Abraham Paul, et al.
Published: (2026)

Exploring the System 1 Thinking Capability of Large Reasoning Models
by: Zhang, Wenyuan, et al.
Published: (2025)

Bridging the Reasoning Gap in Vietnamese with Small Language Models via Test-Time Scaling
by: Trung, Bui The, et al.
Published: (2026)

Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model
by: Ding, Bowen, et al.
Published: (2025)