| Main Authors: | Weller, Niklas, Barkett, Emilio |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.25256 |
Similar Items
The Compulsory Imaginary: AGI and Corporate Authority
by: Barkett, Emilio
Published: (2026)
Status Hierarchies in Language Models
by: Barkett, Emilio
Published: (2026)
Don't Change My View: Ideological Bias Auditing in Large Language Models
by: Kröger, Paul, et al.
Published: (2025)
Representation Without Control: Testing the Realization Effect in Language Models
by: Walsh, Ciarán, et al.
Published: (2026)
Reasoning Isn't Enough: Examining Truth-Bias and Sycophancy in LLMs
by: Barkett, Emilio, et al.
Published: (2025)
Getting out of the Big-Muddy: Escalation of Commitment in LLMs
by: Barkett, Emilio, et al.
Published: (2025)
Whose Truth? Pluralistic Geo-Alignment for (Agentic) AI
by: Janowicz, Krzysztof, et al.
Published: (2025)
Humanline: Online Alignment as Perceptual Loss
by: Liu, Sijia, et al.
Published: (2025)
How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment
by: Huang, Heyan, et al.
Published: (2024)
Characterizing Linear Alignment Across Language Models
by: Gorbett, Matt, et al.
Published: (2026)
Improving Behavioral Alignment in LLM Social Simulations via Context Formation and Navigation
by: Kong, Letian, et al.
Published: (2026)
A Context Alignment Pre-processor for Enhancing the Coherence of Human-LLM Dialog
by: Wei, Ding
Published: (2026)
Can AI Make Conflicts Worse? An Alignment Failure in LLM Deployment Across Conflict Contexts
by: Kryshtal, Andrii
Published: (2026)
Measuring Error Alignment for Decision-Making Systems
by: Xu, Binxia, et al.
Published: (2024)
Interactive AI Alignment: Specification, Process, and Evaluation Alignment
by: Terry, Michael, et al.
Published: (2023)
XChoice: Explainable Evaluation of AI-Human Alignment in LLM-based Constrained Choice Decision Making
by: Qi, Weihong, et al.
Published: (2026)
Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
by: Rastogi, Charvi, et al.
Published: (2025)
The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making
by: Garcia, Basile, et al.
Published: (2024)
Societal Alignment Frameworks Can Improve LLM Alignment
by: Stańczak, Karolina, et al.
Published: (2025)
MoralReason: Generalizable Moral Decision Alignment For LLM Agents Using Reasoning-Level Reinforcement Learning
by: An, Zhiyu, et al.
Published: (2025)
Comparative Analysis of 47 Context-Based Question Answer Models Across 8 Diverse Datasets
by: Muneeb, Muhammad, et al.
Published: (2025)
KTO: Model Alignment as Prospect Theoretic Optimization
by: Ethayarajh, Kawin, et al.
Published: (2024)
SafeWorld: Geo-Diverse Safety Alignment
by: Yin, Da, et al.
Published: (2024)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes
by: Kobalczyk, Katarzyna, et al.
Published: (2024)
Human-Alignment Influences the Utility of AI-assisted Decision Making
by: Benz, Nina L. Corvelo, et al.
Published: (2025)
ALIGN: Prompt-based Attribute Alignment for Reliable, Responsible, and Personalized LLM-based Decision-Making
by: Ravichandran, Bharadwaj, et al.
Published: (2025)
Position: Capability Control Should be a Separate Goal From Alignment
by: Siddiqui, Shoaib Ahmed, et al.
Published: (2026)
Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment
by: Chen, Jiajun, et al.
Published: (2026)
Conformal Feedback Alignment: Quantifying Answer-Level Reliability for Robust LLM Alignment
by: Chen, Tiejin, et al.
Published: (2026)
Deliberative Dynamics and Value Alignment in LLM Debates
by: Sachdeva, Pratik S., et al.
Published: (2025)
Alignment Dynamics in LLM Fine-Tuning
by: Huang, Yuhan, et al.
Published: (2026)
Tokenized Bandit for LLM Decoding and Alignment
by: Shin, Suho, et al.
Published: (2025)
An Evaluation of Cultural Value Alignment in LLM
by: Sukiennik, Nicholas, et al.
Published: (2025)
Understanding Layer Significance in LLM Alignment
by: Shi, Guangyuan, et al.
Published: (2024)
Moral Alignment for LLM Agents
by: Tennant, Elizaveta, et al.
Published: (2024)
Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails
by: Han, Siwei, et al.
Published: (2025)
Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents
by: Srininvasan, Vasundra
Published: (2026)
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs
by: Choi, Dasol, et al.
Published: (2026)
GATEAU: Selecting Influential Samples for Long Context Alignment
by: Si, Shuzheng, et al.
Published: (2024)
LLM Active Alignment: A Nash Equilibrium Perspective
by: Wang, Tonghan, et al.
Published: (2026)