| Main Authors: | Sachdeva, Pratik S., van Nuenen, Tom |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.10002 |
Similar Items
Normative Evaluation of Large Language Models with Everyday Moral Dilemmas
by: Sachdeva, Pratik S., et al.
Published: (2025)
The Fragility Of Moral Judgment In Large Language Models
by: van Nuenen, Tom, et al.
Published: (2026)
Stress Testing Deliberative Alignment for Anti-Scheming Training
by: Schoen, Bronson, et al.
Published: (2025)
Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
by: Jin, Can, et al.
Published: (2026)
Voice Under Revision: Large Language Models and the Normalization of Personal Narrative
by: van Nuenen, Tom
Published: (2026)
Recognition Without Authorization: LLMs and the Moral Order of Online Advice
by: van Nuenen, Tom
Published: (2026)
Multiple LLM Agents Debate for Equitable Cultural Alignment
by: Ki, Dayeon, et al.
Published: (2025)
Deliberative Alignment: Reasoning Enables Safer Language Models
by: Guan, Melody Y., et al.
Published: (2024)
Gradual Vigilance and Interval Communication: Enhancing Value Alignment in Multi-Agent Debates
by: Zou, Rui, et al.
Published: (2024)
Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints
by: Yin, Zhenyun, et al.
Published: (2025)
Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment
by: Chen, Jiajun, et al.
Published: (2026)
An Evaluation of Cultural Value Alignment in LLM
by: Sukiennik, Nicholas, et al.
Published: (2025)
Social Reasoning in Machines: Investigating Collective Truth-Seeking Dynamics in Large Language Model Debate
by: Pecher, Tom
Published: (2026)
VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment
by: Chen, Jiawei, et al.
Published: (2026)
To Redact, or not to Redact? A Local LLM Approach to Deliberative Process Privilege Classification
by: Larooij, Maik, et al.
Published: (2026)
AI-Enhanced Deliberative Democracy and the Future of the Collective Will
by: Revel, Manon, et al.
Published: (2025)
DynaDebate: Breaking Homogeneity in Multi-Agent Debate with Dynamic Path Generation
by: Li, Zhenghao, et al.
Published: (2026)
Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment
by: Corrêa, Nicholas Kluge
Published: (2024)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
by: Wang, Chaojie, et al.
Published: (2024)
Alignment Dynamics in LLM Fine-Tuning
by: Huang, Yuhan, et al.
Published: (2026)
Systematic Biases in LLM Simulations of Debates
by: Taubenfeld, Amir, et al.
Published: (2024)
Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model
by: Pathmanathan, Pankayaraj, et al.
Published: (2026)
Question the Questions: Auditing Representation in Online Deliberative Processes
by: De, Soham, et al.
Published: (2025)
Reason-to-Transmit: Deliberative Adaptive Communication for Cooperative Perception
by: Bansal, Aayam, et al.
Published: (2026)
Ensemble Debates with Local Large Language Models for AI Alignment
by: Sarabamoun, Ephraiem
Published: (2025)
MADIAVE: Multi-Agent Debate for Implicit Attribute Value Extraction
by: Huang, Wei-Chieh, et al.
Published: (2025)
LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection
by: Bal-Ghaoui, Mohamed, et al.
Published: (2025)
On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity
by: Huang, Muhua, et al.
Published: (2025)
Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
by: Hu, Tianyu, et al.
Published: (2025)
D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents
by: Mi, Hongze, et al.
Published: (2025)
AI Debaters are More Persuasive when Arguing in Alignment with Their Own Beliefs
by: Carro, María Victoria, et al.
Published: (2025)
Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making
by: Ma, Shuai, et al.
Published: (2024)
Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
by: Lee, Jaehyeok, et al.
Published: (2026)
Training Deliberative Monitors for Black-Box Scheming Detection
by: Sinha, Aditya, et al.
Published: (2026)
MV-Debate: Multi-view Agent Debate with Dynamic Reflection Gating for Multimodal Harmful Content Detection in Social Media
by: Lu, Rui, et al.
Published: (2025)
Agential AI for Integrated Continual Learning, Deliberative Behavior, and Comprehensible Models
by: Erden, Zeki Doruk, et al.
Published: (2025)
RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement
by: Jiang, Jinhao, et al.
Published: (2024)
Simulating the Evolution of Alignment and Values in Machine Intelligence
by: Eicher, Jonathan Elsworth
Published: (2026)
Baba is LLM: Reasoning in a Game with Dynamic Rules
by: van Wetten, Fien, et al.
Published: (2025)
Toward Stable Value Alignment: Introducing Independent Modules for Consistent Value Guidance
by: Chen, Wenhao, et al.
Published: (2026)