:: Library Catalog

Bibliographic Details
Main Authors:	Kim, Jiseon; Kwon, Jea; Vecchietti, Luiz Felipe; Oh, Alice; Cha, Meeyoung
Format:	Preprint
Published:	2025
Subjects:	Computers and Society; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2504.10886

Similar Items

Dropouts in Confidence: Moral Uncertainty in Human-LLM Alignment
by: Kwon, Jea, et al.
Published: (2025)

Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions
by: Kim, Jiseon, et al.
Published: (2026)

How Training Data Shapes the Use of Parametric and In-Context Knowledge in Language Models
by: Kim, Minsung, et al.
Published: (2025)

Uncovering Factor Level Preferences to Improve Human-Model Alignment
by: Oh, Juhyun, et al.
Published: (2024)

Decoding Multilingual Moral Preferences: Unveiling LLM's Biases Through the Moral Machine Experiment
by: Vida, Karina, et al.
Published: (2024)

Moral Susceptibility and Robustness under Persona Role-Play in Large Language Models
by: Costa, Davi Bastos, et al.
Published: (2025)

Social Catalysts, Not Moral Agents: The Illusion of Alignment in LLM Societies
by: Hu, Yueqing, et al.
Published: (2026)

Training-Free Cultural Alignment of Large Language Models via Persona Disagreement
by: Kiet, Huynh Trung, et al.
Published: (2026)

German General Social Survey Personas: A Survey-Derived Persona Prompt Collection for Population-Aligned LLM Studies
by: Rupprecht, Jens, et al.
Published: (2025)

RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity
by: Shin, Jisu, et al.
Published: (2025)

KoBBQ: Korean Bias Benchmark for Question Answering
by: Jin, Jiho, et al.
Published: (2023)

Societal Alignment Frameworks Can Improve LLM Alignment
by: Stańczak, Karolina, et al.
Published: (2025)

Culturally Grounded Personas in Large Language Models: Characterization and Alignment with Socio-Psychological Value Frameworks
by: Greco, Candida M., et al.
Published: (2026)

Scaling Law in LLM Simulated Personality: More Detailed and Realistic Persona Profile Is All You Need
by: Bai, Yuqi, et al.
Published: (2025)

When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas
by: Backmann, Steffen, et al.
Published: (2025)

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate
by: Oh, Juhyun, et al.
Published: (2024)

LLM Generated Persona is a Promise with a Catch
by: Li, Ang, et al.
Published: (2025)

Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models?
by: Jinnai, Yuu
Published: (2024)

Synthetic Reader Panels: Tournament-Based Ideation with LLM Personas for Autonomous Publishing
by: Zimmerman, Fred
Published: (2026)

PerMix-RLVR: Preserving Persona Expressivity under Verifiable-Reward Alignment
by: Oh, Jihwan, et al.
Published: (2026)

The Need for a Socially-Grounded Persona Framework for User Simulation
by: Venkit, Pranav Narayanan, et al.
Published: (2026)

Steering at the Source: Style Modulation Heads for Robust Persona Control
by: Izawa, Yoshihiro, et al.
Published: (2026)

Rethinking Test-Time Scaling for Medical AI: Model and Task-Aware Strategies for LLMs and VLMs
by: Oh, Gyutaek, et al.
Published: (2025)

LLM Agents Predict Social Media Reactions but Do Not Outperform Text Classifiers: Benchmarking Simulation Accuracy Using 120K+ Personas of 1511 Humans
by: Bojic, Ljubisa, et al.
Published: (2026)

ProgressGym: Alignment with a Millennium of Moral Progress
by: Qiu, Tianyi, et al.
Published: (2024)

Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment
by: Sauter, Adrian, et al.
Published: (2026)

Moral Mazes in the Era of LLMs
by: Nguyen, Dang, et al.
Published: (2026)

Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation
by: Shin, Jisu, et al.
Published: (2025)

From Descriptive to Prescriptive: Uncover the Social Value Alignment of LLM-based Agents
by: Qu, Jinxian, et al.
Published: (2026)

Machine Learning for Detection and Analysis of Novel LLM Jailbreaks
by: Hawkins, John, et al.
Published: (2025)

A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas
by: Venkit, Pranav Narayanan, et al.
Published: (2025)

Empirical Evidence for Alignment Faking in a Small LLM and Prompt-Based Mitigation Techniques
by: Koorndijk, Jeanice
Published: (2025)

EconCausal: A Context-Aware Economic Reasoning Benchmark for Large Language Models
by: Lee, Donggyu, et al.
Published: (2025)

Moral Outrage Shapes Commitments Beyond Attention: Multimodal Moral Emotions on YouTube in Korea and the US
by: Park, Seongchan, et al.
Published: (2026)

Embracing Dialectic Intersubjectivity: Coordination of Different Perspectives in Content Analysis with LLM Persona Simulation
by: Kang, Taewoo, et al.
Published: (2025)

MORALISE: A Structured Benchmark for Moral Alignment in Visual Language Models
by: Lin, Xiao, et al.
Published: (2025)

Moral Alignment for LLM Agents
by: Tennant, Elizaveta, et al.
Published: (2024)

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance
by: Biancotti, Claudia, et al.
Published: (2024)

Widespread Gender and Pronoun Bias in Moral Judgments Across LLMs
by: Fernandes, Gustavo Lúcius, et al.
Published: (2026)

Scopes of Alignment
by: Varshney, Kush R., et al.
Published: (2025)