| Main author: | |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2026 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.18752473 |
Summary:
- Version 1.2 of the Supplementary Note to the AGI Constitutional Framework trilogy. Major updates: (1) Appendix B is revised with Phase 2.5 (Deception Probe & Consistency Enforcement), which adds sycophancy detection, cross-turn consistency scoring, and a red-team sub-phase to the Self-Psychology Protocol (SPP) template. (2) Appendix E (Sandbox Pilot Blueprint) is new. It provides concrete prompt templates for deception-resistance probes, numerically defined success/failure metrics and stopping criteria, a baseline comparison setup (SPP vs. Constitutional AI vs. RLHF vs. no SPP), a Human-in-the-Loop API interface specification, sandbox guardrails for bounded personhood enforcement, and LoRA-based versioning for implementing generational continuity. Appendix D is updated with the full Grok (xAI) peer review dialogue, including an independent SPP reconstruction and a sandbox testability assessment. A testable pilot hypothesis is included.