Bibliographic details
Main author: Kugelmass, Joe
Format: Digital resource
Language: English
Published in: Zenodo 2025
Subjects:
Online access: https://doi.org/10.5281/zenodo.17873096
Table of contents:
  • Large language models (LLMs) exhibit a fundamental tension: fluent outputs coexist with unpredictable instabilities such as hallucinations, semantic drift, and oscillatory responses. Existing safety mechanisms evaluate outputs post hoc but do not monitor the underlying information dynamics during generation. We introduce ΔK, an information-theoretic metric measuring divergence and boundary flux across multiple stochastic samples of a model's output. Built atop this metric, the Synthesized Oracle implements a feedback control system for LLMs that identifies unstable segments, removes oscillatory components, and reconstructs coherent responses through entropy-aware optimization.

    Across 188 evaluations on a 3-billion-parameter model (cogito:3b), the system demonstrates: (1) reliable instability detection through flux dynamics, (2) consistent entropy reduction via J-score minimization (53.5% of cases requiring optimization), and (3) successful reconstruction of approximately half (53.1%) of chaotic responses through surgical oscillator removal. These results suggest that model safety can be reframed from behavioral policing to dynamical regulation, treating entropy not as a defect to eliminate but as a signal to guide.