| Main Author: | Choi, Kang-Sin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2604.01951 |
| _version_ | 1866917467552481280 |
|---|---|
| author | Choi, Kang-Sin |
| author_facet | Choi, Kang-Sin |
| contents | We propose Autolearn, a framework that enables language models to learn from documents they read, with no external supervision. Passages that produce anomalously high per-token loss are flagged, verified through a self-generated Q&A chain, and trained on with conviction-proportional $β_2$ adjustment. We introduce the perturbation gap (paraphrase-to-original perplexity ratio) as a metric that distinguishes memorization from understanding. The key mechanism is the training data format: Q&A-format training drives the perturbation gap below the pre-trained baseline (2.098 vs. 2.204, $Δ= -0.106$, $> 10σ$), suppressing token-sequence memorization, while standard fine-tuning's best attempt remains within noise ($Δ= -0.010$, $< 1σ$). Across four models spanning Qwen3 and Phi-4 families, Autolearn is the only method that enters this regime. Stochastic evaluation reveals passage-specific knowledge acquisition: the probability of generating a correct novel fact rises from 6% to 54% after training ($p < 10^{-4}$), and Q&A format outperforms standard fine-tuning on genuinely novel facts. The system is self-extinguishing: learned content reduces surprisal below threshold and is skipped on re-encounter. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_01951 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Autolearn: Learn by Surprise, Commit by Proof |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2604.01951 |
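
The abstract defines the perturbation gap as the paraphrase-to-original perplexity ratio. The following is a minimal sketch of that ratio, not the paper's implementation. It assumes perplexity is the exponential of the mean per-token negative log-likelihood in nats, which the record does not specify. The function names `perplexity` and `perturbation_gap` and the toy log-probabilities are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    Assumption: perplexity = exp(mean negative log-likelihood);
    the record does not state the exact convention used.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def perturbation_gap(original_logprobs, paraphrase_logprobs):
    """Paraphrase-to-original perplexity ratio, as described in the abstract."""
    return perplexity(paraphrase_logprobs) / perplexity(original_logprobs)


# Toy inputs (hypothetical): the model assigns probability 0.5 to every
# token of the original passage and 0.25 to every token of a paraphrase.
original = [math.log(0.5)] * 4
paraphrase = [math.log(0.25)] * 4
gap = perturbation_gap(original, paraphrase)

# Difference between the two gap values reported in the abstract:
# Q&A-format training (2.098) versus the pre-trained baseline (2.204).
delta = round(2.098 - 2.204, 3)
```

A gap of 1 would mean the paraphrase is as predictable as the original wording. Larger values mean the model favors the exact token sequence over its paraphrase, which the abstract treats as a sign of memorization.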