Staff View: CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning :: Library Catalog

Bibliographic Details
Main Authors:	Wang, Zhuo; Zhang, Zhuo; Li, Yafu; Cheng, Yu; Qu, Lizhen; Xu, Zenglin
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.14768

_version_	1866918450539003904
author	Wang, Zhuo; Zhang, Zhuo; Li, Yafu; Cheng, Yu; Qu, Lizhen; Xu, Zenglin
author_facet	Wang, Zhuo; Zhang, Zhuo; Li, Yafu; Cheng, Yu; Qu, Lizhen; Xu, Zenglin
contents	Large Language Models (LLMs) exhibit strong mathematical reasoning when trained on high-quality Chain-of-Thought (CoT) that articulates intermediate steps, yet costly CoT curation hinders further progress. While existing remedies such as distillation from stronger LLMs and self-synthesis based on test-time search alleviate this issue, they often suffer from diminishing returns or high computing overhead. In this work, we propose CoTEvol, a genetic evolutionary framework that casts CoT generation as a population-based search over reasoning trajectories. Candidate trajectories are iteratively evolved through reflective global crossover at the trajectory level and local mutation guided by uncertainty at the step level, enabling holistic recombination and fine-grained refinement. Lightweight, task-aware fitness functions are designed to guide the evolutionary process toward accurate and diverse reasoning. Empirically, CoTEvol improves correct-CoT synthesis success by over 30% and enhances structural diversity, with markedly improved efficiency. LLMs trained on these evolutionary CoT data achieve an average gain of 6.6% across eight math benchmarks, outperforming previous distillation and self-synthesis approaches. These results underscore the promise of evolutionary CoT synthesis as a scalable and effective method for mathematical reasoning tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_14768
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning Wang, Zhuo Zhang, Zhuo Li, Yafu Cheng, Yu Qu, Lizhen Xu, Zenglin Artificial Intelligence Large Language Models (LLMs) exhibit strong mathematical reasoning when trained on high-quality Chain-of-Thought (CoT) that articulates intermediate steps, yet costly CoT curation hinders further progress. While existing remedies such as distillation from stronger LLMs and self-synthesis based on test-time search alleviate this issue, they often suffer from diminishing returns or high computing overhead. In this work, we propose CoTEvol, a genetic evolutionary framework that casts CoT generation as a population-based search over reasoning trajectories. Candidate trajectories are iteratively evolved through reflective global crossover at the trajectory level and local mutation guided by uncertainty at the step level, enabling holistic recombination and fine-grained refinement. Lightweight, task-aware fitness functions are designed to guide the evolutionary process toward accurate and diverse reasoning. Empirically, CoTEvol improves correct-CoT synthesis success by over 30% and enhances structural diversity, with markedly improved efficiency. LLMs trained on these evolutionary CoT data achieve an average gain of 6.6% across eight math benchmarks, outperforming previous distillation and self-synthesis approaches. These results underscore the promise of evolutionary CoT synthesis as a scalable and effective method for mathematical reasoning tasks.
title	CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning
topic	Artificial Intelligence
url	https://arxiv.org/abs/2604.14768
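
Note: the contents field above describes a population-based search with trajectory-level crossover, step-level mutation guided by uncertainty, and a fitness function. The following is only a toy sketch of that generic genetic loop, not the authors' implementation. Everything in it is an assumption made for illustration: a "trajectory" is a list of integers whose sum should reach a target, the uncertainty scores are random placeholders, and the names `fitness`, `crossover`, `mutate`, and `evolve` are hypothetical. In the setting the record describes, these operators would presumably be driven by an LLM.

```python
import random

# Toy stand-in (assumption): a trajectory is a list of numeric "steps"
# and the final "answer" is their sum, which should equal TARGET.
TARGET = 42
STEPS = 6

def fitness(traj):
    """Higher is better; 0 means the toy answer (the sum) is correct."""
    return -abs(sum(traj) - TARGET)

def crossover(a, b, rng):
    """Trajectory-level recombination: prefix of one parent, suffix of the other."""
    cut = rng.randint(1, STEPS - 1)
    return a[:cut] + b[cut:]

def mutate(traj, uncertainty, rng):
    """Step-level mutation: perturb the single step with the highest uncertainty."""
    i = max(range(len(traj)), key=lambda k: uncertainty[k])
    child = list(traj)
    child[i] += rng.choice([-3, -2, -1, 1, 2, 3])
    return child

def evolve(pop_size=20, generations=200, seed=0):
    """Run an elitist genetic loop and return the fittest trajectory found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 10) for _ in range(STEPS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 0:
            break  # a correct toy trajectory exists, stop early
        parents = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            # Placeholder scores (assumption); a real system might derive
            # per-step uncertainty from model token probabilities.
            uncertainty = [rng.random() for _ in child]
            children.append(mutate(child, uncertainty, rng))
        pop = parents + children
    return max(pop, key=fitness)
```

Calling `evolve()` returns the best toy trajectory found. Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.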

Similar Items