:: Library Catalog

Bibliographic Details
Main Authors:	Wiedermann-Möller, Jonas; Dung, Leonard; Andriushchenko, Maksym
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computers and Society
Online Access:	https://arxiv.org/abs/2605.06490

Similar Items

Policy-Governed LLM Routing with Intent Matching for Instrument Laboratories
by: Olowe, Emmanuel A., et al.
Published: (2026)

Against racing to AGI: Cooperation, deterrence, and catastrophic risks
by: Dung, Leonard, et al.
Published: (2025)

Misalignment or misuse? The AGI alignment tradeoff
by: Hellrigel-Holderbaum, Max, et al.
Published: (2025)

Capability-Based Scaling Trends for LLM-Based Red-Teaming
by: Panfilov, Alexander, et al.
Published: (2025)

Precautionary Governance of Autonomous AI: Legal Personhood as Functional Instrument
by: Brensing, Karsten
Published: (2026)

A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments
by: Cherep, Manuel, et al.
Published: (2025)

Does Refusal Training in LLMs Generalize to the Past Tense?
by: Andriushchenko, Maksym, et al.
Published: (2024)

QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals
by: Qin, Jeremy, et al.
Published: (2026)

AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents
by: Naik, Akshat, et al.
Published: (2025)

The Instrumental Dissolution of Typing: Why AI Challenges the Keyboard Era in Knowledge Work
by: Hua, Wei Roy
Published: (2026)

Beyond Instrumental and Substitutive Paradigms: Introducing Machine Culture as an Emergent Phenomenon in Large Language Models
by: Hu, Yueqing, et al.
Published: (2026)

Deconstructing Student Perceptions of Generative AI (GenAI) through an Expectancy Value Theory (EVT)-based Instrument
by: Chan, Cecilia Ka Yuk, et al.
Published: (2023)

LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations
by: Solopova, Veronika, et al.
Published: (2026)

The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices
by: Matz, Sandra C., et al.
Published: (2025)

Are we Doomed to an AI Race? Why Self-Interest Could Drive Countries Towards a Moratorium on Superintelligence
by: Roussel, Edward, et al.
Published: (2026)

Orchestrating LLM Agents for Scientific Research: A Pilot Study of Multiple Choice Question (MCQ) Generation and Evaluation
by: An, Yuan
Published: (2026)

Scaling Behavior of Single LLM-Driven Multi-Agent Systems
by: Li, Jialing, et al.
Published: (2026)

A More Advanced Group Polarization Measurement Approach Based on LLM-Based Agents and Graphs
by: Liu, Zixin, et al.
Published: (2024)

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
by: Andriushchenko, Maksym, et al.
Published: (2024)

Evaluating the Paperclip Maximizer: Are RL-Based Language Models More Likely to Pursue Instrumental Goals?
by: He, Yufei, et al.
Published: (2025)

Improving Alignment and Robustness with Circuit Breakers
by: Zou, Andy, et al.
Published: (2024)

HalluHard: A Hard Multi-Turn Hallucination Benchmark
by: Fan, Dongyang, et al.
Published: (2026)

FutureSim: Replaying World Events to Evaluate Adaptive Agents
by: Goel, Shashwat, et al.
Published: (2026)

CAMO: An Agentic Framework for Automated Causal Discovery from Micro Behaviors to Macro Emergence in LLM Agent Simulations
by: Yu, Xiangning, et al.
Published: (2026)

A Measure for Level of Autonomy Based on Observable System Behavior
by: Pittman, Jason M.
Published: (2024)

Decomposing and Measuring Evaluation Awareness
by: Li, Changling, et al.
Published: (2026)

Regulating the Agency of LLM-based Agents
by: Boddy, Seán, et al.
Published: (2025)

Mapping Social Choice Theory to RLHF
by: Dai, Jessica, et al.
Published: (2024)

AI Agents and Hard Choices
by: Wang, Kangyu
Published: (2025)

LLM Agents in Law: Taxonomy, Applications, and Challenges
by: Liu, Shuang, et al.
Published: (2026)

Is In-Context Learning Sufficient for Instruction Following in LLMs?
by: Zhao, Hao, et al.
Published: (2024)

How Well Can LLM Agents Simulate End-User Security and Privacy Attitudes and Behaviors?
by: Li, Yuxuan, et al.
Published: (2026)

When Agents See Humans as the Outgroup: Belief-Dependent Bias in LLM-Powered Agents
by: Wang, Zongwei, et al.
Published: (2026)

Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology
by: Hong, Shen Zhou, et al.
Published: (2026)

Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems
by: Fukui, Hiroki
Published: (2026)

Chinese Court Simulation with LLM-Based Agent System
by: Zhang, Kaiyuan, et al.
Published: (2025)

Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play
by: Zeng, Yifan, et al.
Published: (2024)

In-Situ Behavioral Evaluation for LLM Fairness, Not Standardized-Test Scores
by: Tang, Zeyu, et al.
Published: (2026)

Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
by: Wang, Yunzhe, et al.
Published: (2025)

Characterizing the Consistency of the Emergent Misalignment Persona
by: Weckauff, Anietta, et al.
Published: (2026)