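| Main Authors: | Tang, Zhentao; Cui, Yuqi; Kai, Shixiong; Zhao, Wenqian; Ye, Ke; Li, Xing; Tian, Anxin; Pei, Zehua; Zhen, Hui-Ling; Hu, Shoubo; Li, Xiaoguang; Wang, Yunhe; Yuan, Mingxuan |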
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2602.04496 |
| _version_ | 1866912876982173696 |
|---|---|
| author | Tang, Zhentao; Cui, Yuqi; Kai, Shixiong; Zhao, Wenqian; Ye, Ke; Li, Xing; Tian, Anxin; Pei, Zehua; Zhen, Hui-Ling; Hu, Shoubo; Li, Xiaoguang; Wang, Yunhe; Yuan, Mingxuan |
| author_facet | Tang, Zhentao; Cui, Yuqi; Kai, Shixiong; Zhao, Wenqian; Ye, Ke; Li, Xing; Tian, Anxin; Pei, Zehua; Zhen, Hui-Ling; Hu, Shoubo; Li, Xiaoguang; Wang, Yunhe; Yuan, Mingxuan |
| contents | Expert-level scientific reasoning remains challenging for large language models, particularly on benchmarks such as Humanity's Last Exam (HLE), where rigid tool pipelines, brittle multi-agent coordination, and inefficient test-time scaling often limit performance. We introduce ReThinker, a confidence-aware agentic framework that orchestrates retrieval, tool use, and multi-agent reasoning through a stage-wise Solver-Critic-Selector architecture. Rather than following a fixed pipeline, ReThinker dynamically allocates computation based on model confidence, enabling adaptive tool invocation, guided multi-dimensional reflection, and robust confidence-weighted selection. To support scalable training without human annotation, we further propose a reverse data synthesis pipeline and an adaptive trajectory recycling strategy that transform successful reasoning traces into high-quality supervision. Experiments on HLE, GAIA, and XBench demonstrate that ReThinker consistently outperforms state-of-the-art foundation models with tools and existing deep research systems, achieving state-of-the-art results on expert-level reasoning tasks. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_04496 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control |
| topic | Artificial Intelligence |
| url | https://arxiv.org/abs/2602.04496 |