:: Library Catalog

Bibliographic Details
Main Authors:	Kyrychenko, Yara; Zhou, Ke; Bogucka, Edyta; Quercia, Daniele
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2502.15861

Similar Items

Frictionless Love: Associations Between AI Companion Roles and Behavioral Addiction
by: Agarwal, Vibhor, et al.
Published: (2026)

Why AI Harms Can't Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality
by: Bogucka, Edyta, et al.
Published: (2026)

Agent-Supported Foresight for AI Systemic Risks: AI Agents for Breadth, Experts for Judgment
by: Fröhling, Leon, et al.
Published: (2026)

Co-designing an AI Impact Assessment Report Template with AI Practitioners and AI Compliance Experts
by: Bogucka, Edyta, et al.
Published: (2024)

The Hall of AI Fears and Hopes: Comparing the Views of AI Influencers and those of Members of the U.S. Public Through an Interactive Platform
by: Moreira, Gustavo, et al.
Published: (2025)

Atlas of AI Risks: Enhancing Public Understanding of AI Risks
by: Bogucka, Edyta, et al.
Published: (2025)

Evaluating the role of 'Constitutions' for learning from AI feedback
by: Redgate, Saskia, et al.
Published: (2024)

Inverse Constitutional AI: Compressing Preferences into Principles
by: Findeis, Arduin, et al.
Published: (2024)

Should LLMs be WEIRD? Exploring WEIRDness and Human Rights in Large Language Models
by: Zhou, Ke, et al.
Published: (2025)

Improving Labeling Consistency with Detailed Constitutional Definitions and AI-Driven Evaluation
by: Berlin, Konstantin, et al.
Published: (2026)

ExploreGen: Large Language Models for Envisioning the Uses and Risks of AI Technologies
by: Herdel, Viviane, et al.
Published: (2024)

Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses
by: Constantinides, Marios, et al.
Published: (2024)

Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI
by: Maiya, Sharan, et al.
Published: (2025)

Impact Assessment Card: Communicating Risks and Benefits of AI Uses
by: Bogucka, Edyta, et al.
Published: (2025)

Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF
by: Fang, Yuan, et al.
Published: (2026)

RiskRAG: A Data-Driven Solution for Improved AI Model Risk Reporting
by: Rao, Pooja S. B., et al.
Published: (2025)

NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers
by: Greco, Salvatore, et al.
Published: (2024)

Public Constitutional AI
by: Abiri, Gilad
Published: (2024)

RAI Guidelines: Method for Generating Responsible AI Guidelines Grounded in Regulations and Usable by (Non-)Technical Roles
by: Constantinides, Marios, et al.
Published: (2023)

Does Claude's Constitution Have a Culture?
by: Pourdavood, Parham
Published: (2026)

Epistemic Constitutionalism Or: how to avoid coherence bias
by: Loi, Michele
Published: (2026)

Collective Constitutional AI: Aligning a Language Model with Public Input
by: Huang, Saffron, et al.
Published: (2024)

Human Preferences for Constructive Interactions in Language Model Alignment
by: Kyrychenko, Yara, et al.
Published: (2025)

ConstitutionalExperts: Training a Mixture of Principle-based Prompts
by: Petridis, Savvas, et al.
Published: (2024)

MAC: Multi-Agent Constitution Learning
by: Thareja, Rushil, et al.
Published: (2026)

Evaluating GPT-3.5's Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics
by: Greco, Candida M., et al.
Published: (2024)

A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas
by: Venkit, Pranav Narayanan, et al.
Published: (2025)

AI Space Physics: Constitutive boundary semantics for open AI institutions
by: Romanchuk, Oleg, et al.
Published: (2026)

Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning
by: Sel, Bilgehan, et al.
Published: (2026)

The Atlas of AI Incidents in Mobile Computing: Visualizing the Risks and Benefits of AI Gone Mobile
by: Bogucka, Edyta, et al.
Published: (2024)

Compiling Prompts, Not Crafting Them: A Reproducible Workflow for AI-Assisted Evidence Synthesis
by: Susnjak, Teo
Published: (2025)

Interpreting and Controlling Model Behavior via Constitutions for Atomic Concept Edits
by: Kalibhat, Neha, et al.
Published: (2026)

Addressing Climate Action Misperceptions with Generative AI
by: Remshard, Miriam, et al.
Published: (2026)

Constitutional Law and AI Governance: Constraints on Model Licensing and Research Classification
by: Mark, Alex, et al.
Published: (2025)

Crafting Hanzi as Narrative Bridges: An AI Co-Creation Workshop for Elderly Migrants
by: Zhan, Wen, et al.
Published: (2025)

Is Decentralized AI Governable? From Regulative Policy to Constitutive Protocol
by: Hu, Botao Amber, et al.
Published: (2026)

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
by: Sharma, Mrinank, et al.
Published: (2025)

The AI Model Risk Catalog: What Developers and Researchers Miss About Real-World AI Harms
by: Rao, Pooja S. B., et al.
Published: (2025)

How malicious AI swarms can threaten democracy: The fusion of agentic AI and LLMs marks a new frontier in information warfare
by: Schroeder, Daniel Thilo, et al.
Published: (2025)

Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus
by: Ortega, John E., et al.
Published: (2026)