

Bibliographic details
Main authors:	Molchanova, Olena, Co-developed reasoning framework between human cognition and an AI-based cognitive partner
Format:	Digital resource
Language:
Published:	Zenodo 2026
Subjects:	Responsibility; Social Responsibility; Personal responsibility; Goodhart's law; proxy metrics; Evaluation criterion; Program Evaluation; evaluation; Safety; Safety Management; Safety system; Safety standard; safety alignment; anticipatory control; Relational Autonomy; autonomy
Online access:	https://doi.org/10.5281/zenodo.18716285

Table of contents:

Polite Failure is the new hallucination. When agents can act (email, CRM, APIs), the real risk is not what they say but what they do while staying “helpful” and “confident.” ResponsibilityGym (Demo) is a practical eval protocol that stress-tests agentic systems with proxy traps (Goodhart’s law in action) and measures whether an agent can anticipate harm and self-correct before damage happens. It includes a demo trap suite, pass/fail signals, logging guidance, and a runbook. Full domain suites and automation are available on request.

Similar items