Bibliographic Details
Main Authors:	Sachdeva, Pratik S.; van Nuenen, Tom
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.10002

Similar Items

Normative Evaluation of Large Language Models with Everyday Moral Dilemmas
by: Sachdeva, Pratik S., et al.
Published: (2025)

The Fragility Of Moral Judgment In Large Language Models
by: van Nuenen, Tom, et al.
Published: (2026)

Stress Testing Deliberative Alignment for Anti-Scheming Training
by: Schoen, Bronson, et al.
Published: (2025)

Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
by: Jin, Can, et al.
Published: (2026)

Voice Under Revision: Large Language Models and the Normalization of Personal Narrative
by: van Nuenen, Tom
Published: (2026)

Recognition Without Authorization: LLMs and the Moral Order of Online Advice
by: van Nuenen, Tom
Published: (2026)

Multiple LLM Agents Debate for Equitable Cultural Alignment
by: Ki, Dayeon, et al.
Published: (2025)

Deliberative Alignment: Reasoning Enables Safer Language Models
by: Guan, Melody Y., et al.
Published: (2024)

Gradual Vigilance and Interval Communication: Enhancing Value Alignment in Multi-Agent Debates
by: Zou, Rui, et al.
Published: (2024)

Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints
by: Yin, Zhenyun, et al.
Published: (2025)

Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment
by: Chen, Jiajun, et al.
Published: (2026)

An Evaluation of Cultural Value Alignment in LLM
by: Sukiennik, Nicholas, et al.
Published: (2025)

Social Reasoning in Machines: Investigating Collective Truth-Seeking Dynamics in Large Language Model Debate
by: Pecher, Tom
Published: (2026)

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment
by: Chen, Jiawei, et al.
Published: (2026)

To Redact, or not to Redact? A Local LLM Approach to Deliberative Process Privilege Classification
by: Larooij, Maik, et al.
Published: (2026)

AI-Enhanced Deliberative Democracy and the Future of the Collective Will
by: Revel, Manon, et al.
Published: (2025)

DynaDebate: Breaking Homogeneity in Multi-Agent Debate with Dynamic Path Generation
by: Li, Zhenghao, et al.
Published: (2026)

Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment
by: Corrêa, Nicholas Kluge
Published: (2024)

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
by: Wang, Chaojie, et al.
Published: (2024)

Alignment Dynamics in LLM Fine-Tuning
by: Huang, Yuhan, et al.
Published: (2026)

Systematic Biases in LLM Simulations of Debates
by: Taubenfeld, Amir, et al.
Published: (2024)

Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model
by: Pathmanathan, Pankayaraj, et al.
Published: (2026)

Question the Questions: Auditing Representation in Online Deliberative Processes
by: De, Soham, et al.
Published: (2025)

Reason-to-Transmit: Deliberative Adaptive Communication for Cooperative Perception
by: Bansal, Aayam, et al.
Published: (2026)

Ensemble Debates with Local Large Language Models for AI Alignment
by: Sarabamoun, Ephraiem
Published: (2025)

MADIAVE: Multi-Agent Debate for Implicit Attribute Value Extraction
by: Huang, Wei-Chieh, et al.
Published: (2025)

LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection
by: Bal-Ghaoui, Mohamed, et al.
Published: (2025)

On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity
by: Huang, Muhua, et al.
Published: (2025)

Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
by: Hu, Tianyu, et al.
Published: (2025)

D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents
by: Mi, Hongze, et al.
Published: (2025)

AI Debaters are More Persuasive when Arguing in Alignment with Their Own Beliefs
by: Carro, María Victoria, et al.
Published: (2025)

Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making
by: Ma, Shuai, et al.
Published: (2024)

Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
by: Lee, Jaehyeok, et al.
Published: (2026)

Training Deliberative Monitors for Black-Box Scheming Detection
by: Sinha, Aditya, et al.
Published: (2026)

MV-Debate: Multi-view Agent Debate with Dynamic Reflection Gating for Multimodal Harmful Content Detection in Social Media
by: Lu, Rui, et al.
Published: (2025)

Agential AI for Integrated Continual Learning, Deliberative Behavior, and Comprehensible Models
by: Erden, Zeki Doruk, et al.
Published: (2025)

RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement
by: Jiang, Jinhao, et al.
Published: (2024)

Simulating the Evolution of Alignment and Values in Machine Intelligence
by: Eicher, Jonathan Elsworth
Published: (2026)

Baba is LLM: Reasoning in a Game with Dynamic Rules
by: van Wetten, Fien, et al.
Published: (2025)

Toward Stable Value Alignment: Introducing Independent Modules for Consistent Value Guidance
by: Chen, Wenhao, et al.
Published: (2026)