:: Library Catalog

Bibliographic Details
Main Authors:	Pu, Xiao, Saxon, Michael, Hua, Wenyue, Wang, William Yang
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2504.13367

Similar Items

Mitigating Overthinking through Reasoning Shaping
by: Song, Feifan, et al.
Published: (2025)

Benchmarks as Microscopes: A Call for Model Metrology
by: Saxon, Michael, et al.
Published: (2024)

Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
by: Guan, Weixin, et al.
Published: (2026)

Mitigating Overthinking in Large Reasoning Models via Manifold Steering
by: Huang, Yao, et al.
Published: (2025)

Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts
by: Sharma, Aditya, et al.
Published: (2024)

Precedent-Informed Reasoning: Mitigating Overthinking in Large Reasoning Models via Test-Time Precedent Learning
by: Wang, Qianyue, et al.
Published: (2026)

Batch Prompting Suppresses Overthinking Reasoning Under Constraint: How Batch Prompting Suppresses Overthinking in Reasoning Models
by: Srivastava, Saurabh, et al.
Published: (2025)

Do LLMs Overthink Basic Math Reasoning? Benchmarking the Accuracy-Efficiency Tradeoff in Language Models
by: Srivastava, Gaurav, et al.
Published: (2025)

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
by: Zhou, Ruiwen, et al.
Published: (2024)

ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention
by: Wang, Xinyan, et al.
Published: (2026)

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
by: Sui, Yang, et al.
Published: (2025)

NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes
by: Fan, Lizhou, et al.
Published: (2023)

Disentangling Memory and Reasoning Ability in Large Language Models
by: Jin, Mingyu, et al.
Published: (2024)

DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models
by: Yan, Kaiwen, et al.
Published: (2025)

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
by: Han, Jinyi, et al.
Published: (2025)

Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
by: Wang, Xinyi, et al.
Published: (2023)

BadReasoner: Planting Tunable Overthinking Backdoors into Large Reasoning Models for Fun or Profit
by: Yi, Biao, et al.
Published: (2025)

TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation
by: Feng, Weixi, et al.
Published: (2024)

MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation
by: Juneja, Gurusha, et al.
Published: (2025)

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
by: Hassid, Michael, et al.
Published: (2025)

Do You Know About My Nation? Investigating Multilingual Language Models' Cultural Literacy Through Factual Knowledge
by: Tanwar, Eshaan, et al.
Published: (2025)

Think, But Don't Overthink: Reproducing Recursive Language Models
by: Wang, Daren
Published: (2026)

Reasoning or Overthinking: Evaluating Large Language Models on Financial Sentiment Analysis
by: Vamvourellis, Dimitris, et al.
Published: (2025)

Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation
by: Bin, Yi, et al.
Published: (2025)

Addressing Overthinking in Large Vision-Language Models via Gated Perception-Reasoning Optimization
by: Diao, Xingjian, et al.
Published: (2026)

Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts
by: Saxon, Michael, et al.
Published: (2024)

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
by: Fan, Chenrui, et al.
Published: (2025)

The Evolution of Thought: Tracking LLM Overthinking via Reasoning Dynamics Analysis
by: Wei, Zihao, et al.
Published: (2025)

Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
by: Su, Jinyan, et al.
Published: (2025)

MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate
by: Amayuelas, Alfonso, et al.
Published: (2024)

Don't "Overthink" Passage Reranking: Is Reasoning Truly Necessary?
by: Jedidi, Nour, et al.
Published: (2025)

VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs
by: Wu, Qiucheng, et al.
Published: (2024)

Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis
by: Yadav, Anushka, et al.
Published: (2025)

Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
by: Saxon, Michael, et al.
Published: (2024)

The Impact of Reasoning Step Length on Large Language Models
by: Jin, Mingyu, et al.
Published: (2024)

Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling
by: Jiang, Shuyang, et al.
Published: (2025)

InductionBench: LLMs Fail in the Simplest Complexity Class
by: Hua, Wenyue, et al.
Published: (2025)

MAGPIE: A benchmark for Multi-AGent contextual PrIvacy Evaluation
by: Juneja, Gurusha, et al.
Published: (2025)

REALM: A Dataset of Real-World LLM Use Cases
by: Cheng, Jingwen, et al.
Published: (2025)

Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks
by: Hua, Wenyue, et al.
Published: (2024)