| Main Authors: | Wiedermann-Möller, Jonas, Dung, Leonard, Andriushchenko, Maksym |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.06490 |
Similar Items
Policy-Governed LLM Routing with Intent Matching for Instrument Laboratories
by: Olowe, Emmanuel A., et al.
Published: (2026)
Against racing to AGI: Cooperation, deterrence, and catastrophic risks
by: Dung, Leonard, et al.
Published: (2025)
Misalignment or misuse? The AGI alignment tradeoff
by: Hellrigel-Holderbaum, Max, et al.
Published: (2025)
Capability-Based Scaling Trends for LLM-Based Red-Teaming
by: Panfilov, Alexander, et al.
Published: (2025)
Precautionary Governance of Autonomous AI: Legal Personhood as Functional Instrument
by: Brensing, Karsten
Published: (2026)
A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments
by: Cherep, Manuel, et al.
Published: (2025)
Does Refusal Training in LLMs Generalize to the Past Tense?
by: Andriushchenko, Maksym, et al.
Published: (2024)
QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals
by: Qin, Jeremy, et al.
Published: (2026)
AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents
by: Naik, Akshat, et al.
Published: (2025)
The Instrumental Dissolution of Typing: Why AI Challenges the Keyboard Era in Knowledge Work
by: Hua, Wei Roy
Published: (2026)
Beyond Instrumental and Substitutive Paradigms: Introducing Machine Culture as an Emergent Phenomenon in Large Language Models
by: Hu, Yueqing, et al.
Published: (2026)
Deconstructing Student Perceptions of Generative AI (GenAI) through an Expectancy Value Theory (EVT)-based Instrument
by: Chan, Cecilia Ka Yuk, et al.
Published: (2023)
LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations
by: Solopova, Veronika, et al.
Published: (2026)
The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices
by: Matz, Sandra C., et al.
Published: (2025)
Are we Doomed to an AI Race? Why Self-Interest Could Drive Countries Towards a Moratorium on Superintelligence
by: Roussel, Edward, et al.
Published: (2026)
Orchestrating LLM Agents for Scientific Research: A Pilot Study of Multiple Choice Question (MCQ) Generation and Evaluation
by: An, Yuan
Published: (2026)
Scaling Behavior of Single LLM-Driven Multi-Agent Systems
by: Li, Jialing, et al.
Published: (2026)
A More Advanced Group Polarization Measurement Approach Based on LLM-Based Agents and Graphs
by: Liu, Zixin, et al.
Published: (2024)
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
by: Andriushchenko, Maksym, et al.
Published: (2024)
Evaluating the Paperclip Maximizer: Are RL-Based Language Models More Likely to Pursue Instrumental Goals?
by: He, Yufei, et al.
Published: (2025)
Improving Alignment and Robustness with Circuit Breakers
by: Zou, Andy, et al.
Published: (2024)
HalluHard: A Hard Multi-Turn Hallucination Benchmark
by: Fan, Dongyang, et al.
Published: (2026)
FutureSim: Replaying World Events to Evaluate Adaptive Agents
by: Goel, Shashwat, et al.
Published: (2026)
CAMO: An Agentic Framework for Automated Causal Discovery from Micro Behaviors to Macro Emergence in LLM Agent Simulations
by: Yu, Xiangning, et al.
Published: (2026)
A Measure for Level of Autonomy Based on Observable System Behavior
by: Pittman, Jason M.
Published: (2024)
Decomposing and Measuring Evaluation Awareness
by: Li, Changling, et al.
Published: (2026)
Regulating the Agency of LLM-based Agents
by: Boddy, Seán, et al.
Published: (2025)
Mapping Social Choice Theory to RLHF
by: Dai, Jessica, et al.
Published: (2024)
AI Agents and Hard Choices
by: Wang, Kangyu
Published: (2025)
LLM Agents in Law: Taxonomy, Applications, and Challenges
by: Liu, Shuang, et al.
Published: (2026)
Is In-Context Learning Sufficient for Instruction Following in LLMs?
by: Zhao, Hao, et al.
Published: (2024)
How Well Can LLM Agents Simulate End-User Security and Privacy Attitudes and Behaviors?
by: Li, Yuxuan, et al.
Published: (2026)
When Agents See Humans as the Outgroup: Belief-Dependent Bias in LLM-Powered Agents
by: Wang, Zongwei, et al.
Published: (2026)
Measuring Mid-2025 LLM-Assistance on Novice Performance in Biology
by: Hong, Shen Zhou, et al.
Published: (2026)
Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems
by: Fukui, Hiroki
Published: (2026)
Chinese Court Simulation with LLM-Based Agent System
by: Zhang, Kaiyuan, et al.
Published: (2025)
Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play
by: Zeng, Yifan, et al.
Published: (2024)
In-Situ Behavioral Evaluation for LLM Fairness, Not Standardized-Test Scores
by: Tang, Zeyu, et al.
Published: (2026)
Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
by: Wang, Yunzhe, et al.
Published: (2025)
Characterizing the Consistency of the Emergent Misalignment Persona
by: Weckauff, Anietta, et al.
Published: (2026)