Bibliographic Details
Main Authors: Zhang, Yong; Zhang, Bingyuan; Li, Zhitao; Li, Ming; Cheng, Ning; Chen, Minchuan; Wei, Tao; Ma, Jun; Wang, Shaojun; Xiao, Jing
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2502.12744
author Zhang, Yong
Zhang, Bingyuan
Li, Zhitao
Li, Ming
Cheng, Ning
Chen, Minchuan
Wei, Tao
Ma, Jun
Wang, Shaojun
Xiao, Jing
contents The rapid advancement of large language models (LLMs) has significantly enhanced their reasoning abilities, enabling increasingly complex tasks. However, these capabilities often diminish in smaller, more computationally efficient models like GPT-2. Recent research shows that reasoning distillation can help small models acquire reasoning capabilities, but most existing methods focus primarily on improving teacher-generated reasoning paths. Our observations reveal that small models can generate high-quality reasoning paths during sampling, even without chain-of-thought prompting, though these paths are often latent due to their low probability under standard decoding strategies. To address this, we propose Self-Enhanced Reasoning Training (SERT), which activates and leverages latent reasoning capabilities in small models through self-training on filtered, self-generated reasoning paths under zero-shot conditions. Experiments using OpenAI's GPT-3.5 as the teacher model and GPT-2 models as the student models demonstrate that SERT enhances the reasoning abilities of small models, improving their performance in reasoning distillation.
format Preprint
id arxiv_https___arxiv_org_abs_2502_12744
institution arXiv
publishDate 2025
record_format arxiv
title Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning Distillation
topic Computation and Language
url https://arxiv.org/abs/2502.12744