| Main Authors: | Kyrychenko, Yara, Zhou, Ke, Bogucka, Edyta, Quercia, Daniele |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.15861 |
Similar Items
Frictionless Love: Associations Between AI Companion Roles and Behavioral Addiction
by: Agarwal, Vibhor, et al.
Published: (2026)
Why AI Harms Can't Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality
by: Bogucka, Edyta, et al.
Published: (2026)
Agent-Supported Foresight for AI Systemic Risks: AI Agents for Breadth, Experts for Judgment
by: Fröhling, Leon, et al.
Published: (2026)
Co-designing an AI Impact Assessment Report Template with AI Practitioners and AI Compliance Experts
by: Bogucka, Edyta, et al.
Published: (2024)
The Hall of AI Fears and Hopes: Comparing the Views of AI Influencers and those of Members of the U.S. Public Through an Interactive Platform
by: Moreira, Gustavo, et al.
Published: (2025)
Atlas of AI Risks: Enhancing Public Understanding of AI Risks
by: Bogucka, Edyta, et al.
Published: (2025)
Evaluating the role of 'Constitutions' for learning from AI feedback
by: Redgate, Saskia, et al.
Published: (2024)
Inverse Constitutional AI: Compressing Preferences into Principles
by: Findeis, Arduin, et al.
Published: (2024)
Should LLMs be WEIRD? Exploring WEIRDness and Human Rights in Large Language Models
by: Zhou, Ke, et al.
Published: (2025)
Improving Labeling Consistency with Detailed Constitutional Definitions and AI-Driven Evaluation
by: Berlin, Konstantin, et al.
Published: (2026)
ExploreGen: Large Language Models for Envisioning the Uses and Risks of AI Technologies
by: Herdel, Viviane, et al.
Published: (2024)
Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses
by: Constantinides, Marios, et al.
Published: (2024)
Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI
by: Maiya, Sharan, et al.
Published: (2025)
Impact Assessment Card: Communicating Risks and Benefits of AI Uses
by: Bogucka, Edyta, et al.
Published: (2025)
Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF
by: Fang, Yuan, et al.
Published: (2026)
RiskRAG: A Data-Driven Solution for Improved AI Model Risk Reporting
by: Rao, Pooja S. B., et al.
Published: (2025)
NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers
by: Greco, Salvatore, et al.
Published: (2024)
Public Constitutional AI
by: Abiri, Gilad
Published: (2024)
RAI Guidelines: Method for Generating Responsible AI Guidelines Grounded in Regulations and Usable by (Non-)Technical Roles
by: Constantinides, Marios, et al.
Published: (2023)
Does Claude's Constitution Have a Culture?
by: Pourdavood, Parham
Published: (2026)
Epistemic Constitutionalism Or: how to avoid coherence bias
by: Loi, Michele
Published: (2026)
Collective Constitutional AI: Aligning a Language Model with Public Input
by: Huang, Saffron, et al.
Published: (2024)
Human Preferences for Constructive Interactions in Language Model Alignment
by: Kyrychenko, Yara, et al.
Published: (2025)
ConstitutionalExperts: Training a Mixture of Principle-based Prompts
by: Petridis, Savvas, et al.
Published: (2024)
MAC: Multi-Agent Constitution Learning
by: Thareja, Rushil, et al.
Published: (2026)
Evaluating GPT-3.5's Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics
by: Greco, Candida M., et al.
Published: (2024)
A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas
by: Venkit, Pranav Narayanan, et al.
Published: (2025)
AI Space Physics: Constitutive boundary semantics for open AI institutions
by: Romanchuk, Oleg, et al.
Published: (2026)
Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning
by: Sel, Bilgehan, et al.
Published: (2026)
The Atlas of AI Incidents in Mobile Computing: Visualizing the Risks and Benefits of AI Gone Mobile
by: Bogucka, Edyta, et al.
Published: (2024)
Compiling Prompts, Not Crafting Them: A Reproducible Workflow for AI-Assisted Evidence Synthesis
by: Susnjak, Teo
Published: (2025)
Interpreting and Controlling Model Behavior via Constitutions for Atomic Concept Edits
by: Kalibhat, Neha, et al.
Published: (2026)
Addressing Climate Action Misperceptions with Generative AI
by: Remshard, Miriam, et al.
Published: (2026)
Constitutional Law and AI Governance: Constraints on Model Licensing and Research Classification
by: Mark, Alex, et al.
Published: (2025)
Crafting Hanzi as Narrative Bridges: An AI Co-Creation Workshop for Elderly Migrants
by: Zhan, Wen, et al.
Published: (2025)
Is Decentralized AI Governable? From Regulative Policy to Constitutive Protocol
by: Hu, Botao Amber, et al.
Published: (2026)
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
by: Sharma, Mrinank, et al.
Published: (2025)
The AI Model Risk Catalog: What Developers and Researchers Miss About Real-World AI Harms
by: Rao, Pooja S. B., et al.
Published: (2025)
How malicious AI swarms can threaten democracy: The fusion of agentic AI and LLMs marks a new frontier in information warfare
by: Schroeder, Daniel Thilo, et al.
Published: (2025)
Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus
by: Ortega, John E., et al.
Published: (2026)