Bibliographic Details
Main Authors: Yu, Ping; Xu, Jing; Weston, Jason; Kulikov, Ilia
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2407.06023
author Yu, Ping
Xu, Jing
Weston, Jason
Kulikov, Ilia
contents Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed, such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023) and Branch-Solve-Merge (Saha et al., 2023). In this work we investigate self-supervised methods to "compile" (distill) higher-quality outputs from System 2 techniques back into LLM generations without intermediate reasoning token sequences, as this reasoning has been distilled into System 1. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that such System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.
format Preprint
id arxiv_https___arxiv_org_abs_2407_06023
institution arXiv
publishDate 2024
record_format arxiv
title Distilling System 2 into System 1
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2407.06023