Description: :: Library Catalog

Bibliographic Details
Author:	Kugelmass, Joe
Format:	Digital resource
Language:	English
Published:	Zenodo 2025
Subjects:	instability detection; LLM alignment; generative AI; control theory; cybernetics; flux measurement; entropy; model safety; dynamical systems; large language models; artificial intelligence; LLMs
Online access:	https://doi.org/10.5281/zenodo.17872999

Description:

<p>Large language models (LLMs) exhibit a fundamental tension: fluent outputs coexist with unpredictable instabilities—hallucinations, semantic drift, and oscillatory responses. Existing safety mechanisms evaluate outputs post-hoc but do not monitor the underlying information dynamics during generation. We introduce ΔK, an information-theoretic metric measuring divergence and boundary flux across multiple stochastic samples of a model's output. Built atop this metric, the Synthesized Oracle implements a feedback control system for LLMs that identifies unstable segments, removes oscillatory components, and reconstructs coherent responses through entropy-aware optimization.</p> <p>Across 188 evaluations on a 3-billion parameter model (cogito:3b), the system demonstrates: (1) reliable instability detection through flux dynamics, (2) consistent entropy reduction via J-score minimization (53.5% of cases requiring optimization), and (3) successful reconstruction of approximately half (53.1%) of chaotic responses through surgical oscillator removal. These results suggest that model safety can be reframed from behavioral policing to dynamical regulation—treating entropy not as a defect to eliminate but as a signal to guide.</p>
