Bibliographic Details
Main Authors: Tang, Zhentao, Cui, Yuqi, Kai, Shixiong, Zhao, Wenqian, Ye, Ke, Li, Xing, Tian, Anxin, Pei, Zehua, Zhen, Hui-Ling, Hu, Shoubo, Li, Xiaoguang, Wang, Yunhe, Yuan, Mingxuan
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.04496
author Tang, Zhentao
Cui, Yuqi
Kai, Shixiong
Zhao, Wenqian
Ye, Ke
Li, Xing
Tian, Anxin
Pei, Zehua
Zhen, Hui-Ling
Hu, Shoubo
Li, Xiaoguang
Wang, Yunhe
Yuan, Mingxuan
contents Expert-level scientific reasoning remains challenging for large language models, particularly on benchmarks such as Humanity's Last Exam (HLE), where rigid tool pipelines, brittle multi-agent coordination, and inefficient test-time scaling often limit performance. We introduce ReThinker, a confidence-aware agentic framework that orchestrates retrieval, tool use, and multi-agent reasoning through a stage-wise Solver-Critic-Selector architecture. Rather than following a fixed pipeline, ReThinker dynamically allocates computation based on model confidence, enabling adaptive tool invocation, guided multi-dimensional reflection, and robust confidence-weighted selection. To support scalable training without human annotation, we further propose a reverse data synthesis pipeline and an adaptive trajectory recycling strategy that transform successful reasoning traces into high-quality supervision. Experiments on HLE, GAIA, and XBench demonstrate that ReThinker consistently outperforms state-of-the-art foundation models with tools and existing deep research systems, achieving state-of-the-art results on expert-level reasoning tasks.
format Preprint
id arxiv_https___arxiv_org_abs_2602_04496
institution arXiv
publishDate 2026
record_format arxiv
title ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control
topic Artificial Intelligence
url https://arxiv.org/abs/2602.04496