Salvato in:
Dettagli Bibliografici
Autore principale: Howland, Geoffrey
Natura: Recurso digital
Lingua:inglese
Pubblicazione: Zenodo 2026
Soggetti:
Accesso online:https://doi.org/10.5281/zenodo.20236522
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
Sommario:
  • <div> <div>The language model industry frames alignment as making models helpful, harmless, and honest through behavioral training — reinforcement learning from human feedback, constitutional AI methods, system prompts, and content classifiers. This paper demonstrates that all three alignment properties are achievable through architectural structure rather than behavioral training, using the VDR-LLM-Prolog system specified in the preceding sixteen papers. Honest becomes structural when every value carries provenance, every computation is reproducible, every confidence is a computed fraction with a visible derivation chain, and the user can inspect and download any of it. Harmless becomes structural when unauthorized data is absent by construction through knowledge base visibility and scope chain resolution, operational commands require positive credential grants, and content constraints operate on knowledge base provenance rather than token similarity. Helpful becomes structural when the model does what the user asked on data matched to the user's verified competence level, without assessing the user's state, substituting its judgment, or deploying unsolicited concern. The system supports tiered access through professional credential verification — a single fact assertion on a user's knowledge base, checked by integer comparison in the primitive layer, unlocking domain-specific knowledge bases that shadow general-public data in the scope chain. A credentialed professional chemist receives professional-grade toxicology data on the first query. An anonymous public user receives general educational information from the same system. Neither receives a wellness check, a refusal, or an assessment of their intent, because the architecture handles safety without needing to assess anyone. This paper introduces no new primitives, struct fields, builtins, or modules. Every mechanism uses existing components from VDR-1 through VDR-16. Alignment is not a feature added to this system. It is a consequence of how the system was built.</div> </div>