| Author: | |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2025 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.17872999 |
Contents:
- <p>Large language models (LLMs) exhibit a fundamental tension: fluent outputs coexist with unpredictable instabilities (hallucinations, semantic drift, and oscillatory responses). Existing safety mechanisms evaluate outputs post hoc but do not monitor the underlying information dynamics during generation. We introduce ΔK, an information-theoretic metric that measures divergence and boundary flux across multiple stochastic samples of a model's output. Built on this metric, the Synthesized Oracle implements a feedback control system for LLMs that identifies unstable segments, removes oscillatory components, and reconstructs coherent responses through entropy-aware optimization.</p> <p>Across 188 evaluations on a 3-billion-parameter model (cogito:3b), the system demonstrates: (1) reliable instability detection through flux dynamics, (2) consistent entropy reduction via J-score minimization (53.5% of cases requiring optimization), and (3) successful reconstruction of approximately half (53.1%) of chaotic responses through surgical oscillator removal. These results suggest that model safety can be reframed from behavioral policing to dynamical regulation, treating entropy not as a defect to eliminate but as a signal to guide.</p>
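The abstract does not give a formula for ΔK, so the sketch below is not the authors' metric. It only illustrates the general idea of measuring divergence across multiple stochastic samples of a model's output, assuming a stand-in measure: the mean pairwise Jensen-Shannon divergence between unigram token distributions. The function names (`token_distribution`, `js_divergence`, `mean_pairwise_divergence`) and the whitespace tokenization are hypothetical choices for illustration. The boundary-flux component and the J-score are not modeled here.

```python
import math
from collections import Counter
from itertools import combinations


def token_distribution(text):
    """Unigram distribution over lowercased, whitespace-separated tokens.

    Assumption: whitespace tokenization stands in for a real tokenizer.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two token distributions.

    Equals 0 for identical distributions and 1 for disjoint supports.
    """
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}

    def kl_to_mixture(a):
        return sum(pa * math.log2(pa / m[t]) for t, pa in a.items() if pa > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def mean_pairwise_divergence(samples):
    """Average JS divergence over all pairs of sampled responses.

    Stand-in for a sample-divergence score; NOT the ΔK defined in the record.
    """
    dists = [token_distribution(s) for s in samples]
    pairs = list(combinations(dists, 2))
    if not pairs:
        return 0.0
    return sum(js_divergence(p, q) for p, q in pairs) / len(pairs)
```

Under these assumptions, a set of identical samples scores 0 and samples with no shared tokens score 1, so a higher value would flag a prompt whose sampled responses disagree with one another. A unigram measure ignores word order and meaning, so it is at best a crude proxy for the semantic drift the abstract describes.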