| Main Authors: | Zhang, Zhipeng; He, Hongshun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2603.21043 |
| _version_ | 1866914413495189504 |
|---|---|
| author | Zhang, Zhipeng; He, Hongshun |
| contents | Humans must flexibly arbitrate between exploring alternatives and exploiting learned strategies, yet they frequently exhibit maladaptive persistence by continuing to execute failing strategies despite accumulating negative evidence. Here we propose a "confidence-freeze" account that reframes such persistence as a dynamic learning state rather than a stable dispositional trait. Using a multi-reversal two-armed bandit task across three experiments (total N = 332; 19,920 trials), we first show that human learners normally make use of the symmetric statistical structure inherent in outcome trajectories: runs of successes provide positive evidence for environmental stability and thus for strategy maintenance, whereas runs of failures provide negative evidence and should raise switching probability. Behaviour in the control group conformed to this normative pattern. However, individuals who experienced a high rate of early success (90% vs. 60%) displayed a robust and selective distortion after the first reversal: they persisted through long stretches of non-reward (mean = 6.2 consecutive losses) while their metacognitive confidence ratings simultaneously dropped from 5 to 2 on a 7-point scale. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_21043 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Confidence Freeze: Early Success Induces a Metastable Decoupling of Metacognition and Behaviour |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2603.21043 |