:: Library Catalog

Bibliographic Details
Main Author:	Fukui, Hiroki
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence; Computers and Society
Online Access:	https://arxiv.org/abs/2604.00021

Similar Items

A molecular clock for writing systems reveals the quantitative impact of imperial power on cultural evolution
by: Fukui, Hiroki
Published: (2026)

"Pull or Not to Pull?": Investigating Moral Biases in Leading Large Language Models Across Ethical Dilemmas
by: Ding, Junchen, et al.
Published: (2025)

Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
by: Duan, Shitong, et al.
Published: (2023)

Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems
by: Fukui, Hiroki
Published: (2026)

Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
by: Xu, Rongwu, et al.
Published: (2024)

Semantic Consistency for Assuring Reliability of Large Language Models
by: Raj, Harsh, et al.
Published: (2023)

Alignment as Iatrogenesis: Pastoral Power, Collective Pathology, and the Structural Limits of Monolingual Safety Evaluation
by: Fukui, Hiroki
Published: (2026)

Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play
by: Zeng, Yifan, et al.
Published: (2024)

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution
by: Yang, Jiuding, et al.
Published: (2024)

How Large Language Models are Designed to Hallucinate
by: Ackermann, Richard, et al.
Published: (2025)

Do Large Language Models Get Caught in Hofstadter-Mobius Loops?
by: Hryszko, Jaroslaw
Published: (2026)

Large Language Models' Complicit Responses to Illicit Instructions across Socio-Legal Contexts
by: Wang, Xing, et al.
Published: (2025)

LocalValueBench: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models
by: Meadows, Gwenyth Isobel, et al.
Published: (2024)

From Argumentation to Deliberation: Perspectivized Stance Vectors for Fine-grained (Dis)agreement Analysis
by: Plenz, Moritz, et al.
Published: (2025)

MedSimAI: Simulation and Formative Feedback Generation to Enhance Deliberate Practice in Medical Education
by: Hicke, Yann, et al.
Published: (2025)

Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems
by: Fukui, Hiroki
Published: (2026)

Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models
by: Rehana, Hasin, et al.
Published: (2025)

Do GPT Language Models Suffer From Split Personality Disorder? The Advent Of Substrate-Free Psychometrics
by: Romero, Peter, et al.
Published: (2024)

Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest
by: Wu, Addison J., et al.
Published: (2026)

Teaching Language Models How to Code Like Learners: Conversational Serialization for Student Simulation
by: Koutcheme, Charles, et al.
Published: (2026)

Interactive DualChecker for Mitigating Hallucinations in Distilling Large Language Models
by: Wang, Meiyun, et al.
Published: (2024)

Anecdoctoring: Automated Red-Teaming Across Language and Place
by: Cuevas, Alejandro, et al.
Published: (2025)

Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes
by: Gallegos, Isabel O., et al.
Published: (2024)

Explainable Ethical Assessment on Human Behaviors by Generating Conflicting Social Norms
by: Sun, Yuxi, et al.
Published: (2025)

How Do Vision-Language Models Process Conflicting Information Across Modalities?
by: Hua, Tianze, et al.
Published: (2025)

Talking the Talk Does Not Entail Walking the Walk: On the Limits of Large Language Models in Lexical Entailment Recognition
by: Greco, Candida M., et al.
Published: (2024)

DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models
by: Fu, Jiachen, et al.
Published: (2025)

LLM Agents in Interaction: Measuring Personality Consistency and Linguistic Alignment in Interacting Populations of Large Language Models
by: Frisch, Ivar, et al.
Published: (2024)

EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI
by: Kasu, Sai Kartheek Reddy
Published: (2025)

Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements
by: Karamolegkou, Antonia, et al.
Published: (2024)

A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas
by: Venkit, Pranav Narayanan, et al.
Published: (2025)

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection
by: Ahmed, Ahmed Haj, et al.
Published: (2024)

On the Creativity of Large Language Models
by: Franceschelli, Giorgio, et al.
Published: (2023)

Bridging the Copyright Gap: Do Large Vision-Language Models Recognize and Respect Copyrighted Content?
by: Xu, Naen, et al.
Published: (2025)

Cross-Language Bias Examination in Large Language Models
by: Liang, Yuxuan, et al.
Published: (2025)

Do Language Models Reason Across Languages?
by: Meng, Yan, et al.
Published: (2026)

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most
by: Zhou, Kaitlyn, et al.
Published: (2026)

The Moral Consistency Pipeline: Continuous Ethical Evaluation for Large Language Models
by: Jamshidi, Saeid, et al.
Published: (2025)

Anticipating Innovation Using Large Language Models
by: Fenoaltea, Enrico Maria, et al.
Published: (2026)

Evaluating Large Language Models for Detecting Antisemitism
by: Patel, Jay, et al.
Published: (2025)