Table of contents :: Library Catalog

Bibliographic details
Main author:	Kugelmass, Joe
Format:	Digital resource
Language:	English
Published in:	Zenodo 2025
Subjects:	instability detection; LLM alignment; generative AI; control theory; cybernetics; flux measurement; entropy; model safety; dynamical systems; large language models; artificial intelligence; LLMs
Online access:	https://doi.org/10.5281/zenodo.17873096
Table of contents:

Large language models (LLMs) exhibit a fundamental tension: fluent outputs coexist with unpredictable instabilities (hallucinations, semantic drift, and oscillatory responses). Existing safety mechanisms evaluate outputs post hoc but do not monitor the underlying information dynamics during generation. We introduce ΔK, an information-theoretic metric that measures divergence and boundary flux across multiple stochastic samples of a model's output. Built atop this metric, the Synthesized Oracle implements a feedback control system for LLMs that identifies unstable segments, removes oscillatory components, and reconstructs coherent responses through entropy-aware optimization.

Across 188 evaluations on a 3-billion-parameter model (cogito:3b), the system demonstrates: (1) reliable instability detection through flux dynamics, (2) consistent entropy reduction via J-score minimization (53.5% of cases requiring optimization), and (3) successful reconstruction of approximately half (53.1%) of chaotic responses through surgical oscillator removal. These results suggest that model safety can be reframed from behavioral policing to dynamical regulation, treating entropy not as a defect to eliminate but as a signal to guide.

Similar items