Bibliographic Details
Main Author: Bankuti, Omri
Format: Digital resource
Language: English
Published: Zenodo 2026
Subjects:
Online Access: https://doi.org/10.5281/zenodo.18752473
Summary:
  • Version 1.2 of the Supplementary Note to the AGI Constitutional Framework trilogy. Major updates: (1) Appendix B is revised with Phase 2.5 (Deception Probe & Consistency Enforcement), which adds sycophancy detection, cross-turn consistency scoring, and a red-team sub-phase to the Self-Psychology Protocol (SPP) template. (2) Appendix E (Sandbox Pilot Blueprint) is new. It provides concrete prompt templates for deception-resistance probes, numerically defined success/failure metrics and stopping criteria, a baseline comparison setup (SPP vs. Constitutional AI vs. RLHF vs. no-SPP), a Human-in-the-Loop API interface specification, sandbox guardrails for bounded personhood enforcement, and LoRA-based versioning for implementing generational continuity. Appendix D is updated with the full Grok (xAI) peer review dialogue, including an independent SPP reconstruction and a sandbox testability assessment. A testable pilot hypothesis is included.