Bibliographic Details
Main Authors: Zhang, Zhipeng, He, Hongshun
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2603.21043
author Zhang, Zhipeng
He, Hongshun
author_facet Zhang, Zhipeng
He, Hongshun
contents Humans must flexibly arbitrate between exploring alternatives and exploiting learned strategies, yet they frequently exhibit maladaptive persistence by continuing to execute failing strategies despite accumulating negative evidence. Here we propose a "confidence-freeze" account that reframes such persistence as a dynamic learning state rather than a stable dispositional trait. Using a multi-reversal two-armed bandit task across three experiments (total N = 332; 19,920 trials), we first show that human learners normally make use of the symmetric statistical structure inherent in outcome trajectories: runs of successes provide positive evidence for environmental stability and thus for strategy maintenance, whereas runs of failures provide negative evidence and should raise switching probability. Behaviour in the control group conformed to this normative pattern. However, individuals who experienced a high rate of early success (90% vs. 60%) displayed a robust and selective distortion after the first reversal: they persisted through long stretches of non-reward (mean = 6.2 consecutive losses) while their metacognitive confidence ratings simultaneously dropped from 5 to 2 on a 7-point scale.
format Preprint
id arxiv_https___arxiv_org_abs_2603_21043
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Confidence Freeze: Early Success Induces a Metastable Decoupling of Metacognition and Behaviour
Zhang, Zhipeng
He, Hongshun
Machine Learning
title Confidence Freeze: Early Success Induces a Metastable Decoupling of Metacognition and Behaviour
topic Machine Learning
url https://arxiv.org/abs/2603.21043