Bibliographic Details
Main Authors: Liu, Liping; Zhang, Chunhong; Wu, Likang; Zhao, Chuang; Hu, Zheng; He, Ming; Fan, Jianping
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2503.00902
Table of Contents:
  • Self-reflection for Large Language Models (LLMs) has gained significant attention. Existing approaches have models iteratively improve their previous responses, relying either on the LLM's internal reflection ability or on external feedback. However, recent research has questioned whether intrinsic self-correction without external feedback is effective, suggesting that it may even degrade performance. Our empirical evidence shows that current static reflection methods can suffer from redundancy, drift, and stubbornness. To mitigate these issues, we introduce Instruct-of-Reflection (IoRT), a novel and general reflection framework that leverages dynamic-meta instruction to enhance the iterative reflection capability of LLMs. Specifically, we propose an instructor, driven by meta-thoughts and a self-consistency classifier, that generates various instructions, including refresh, stop, and select, to guide the next reflection iteration. Our experiments demonstrate that IoRT achieves an average improvement of 10.1% over established baselines in mathematical and commonsense reasoning tasks, highlighting its efficacy and applicability.
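
The abstract describes an instructor that issues refresh, stop, or select instructions between reflection rounds. A minimal sketch of how such an instruction-guided loop could be wired together is given below. It is an illustration only, not the authors' implementation: the function names (`reflect_loop`, `generate`, `reflect`, `instruct`, `select`), the `max_iters` cap, and the exact meaning given to each instruction are assumptions, since the abstract does not define them. Here "stop" is taken to end the loop, "select" to choose between the previous and revised answers, and "refresh" to discard both and regenerate.

```python
from enum import Enum
from typing import Callable


class Instruction(Enum):
    # The three instruction names come from the abstract; their semantics
    # in this sketch are assumptions made for illustration.
    REFRESH = "refresh"
    STOP = "stop"
    SELECT = "select"


def reflect_loop(
    question: str,
    generate: Callable[[str], str],                   # hypothetical: produce a fresh answer
    reflect: Callable[[str, str], str],               # hypothetical: revise an answer
    instruct: Callable[[str, str, str], Instruction], # hypothetical instructor
    select: Callable[[str, str, str], str],           # hypothetical: pick one of two answers
    max_iters: int = 4,                               # assumed iteration cap
) -> str:
    """Run reflection rounds, letting the instructor steer each next step."""
    answer = generate(question)
    for _ in range(max_iters):
        revised = reflect(question, answer)
        instruction = instruct(question, answer, revised)
        if instruction is Instruction.STOP:
            # Assumed: stop ends the loop and keeps the revised answer.
            return revised
        if instruction is Instruction.SELECT:
            # Assumed: select chooses between the previous and revised answers.
            answer = select(question, answer, revised)
        else:
            # Assumed: refresh discards both answers and regenerates.
            answer = generate(question)
    return answer
```

In this sketch the instructor is passed in as a plain callable, so a component built on meta-thoughts and a self-consistency classifier, as the abstract describes, could be substituted without changing the loop.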