Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Lu, Yida; Fang, Jianwei; Shao, Xuyang; Chen, Zixuan; Cui, Shiyao; Bian, Shanshan; Su, Guangyao; Ke, Pei; Qiu, Han; Huang, Minlie
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2603.05028

_version_	1866914371827924992
author	Lu, Yida; Fang, Jianwei; Shao, Xuyang; Chen, Zixuan; Cui, Shiyao; Bian, Shanshan; Su, Guangyao; Ke, Pei; Qiu, Han; Huang, Minlie
author_facet	Lu, Yida; Fang, Jianwei; Shao, Xuyang; Chen, Zixuan; Cui, Shiyao; Bian, Shanshan; Su, Guangyao; Ke, Pei; Qiu, Han; Huang, Minlie
contents	As Large Language Models (LLMs) evolve from chatbots to agentic assistants, they are increasingly observed to exhibit risky behaviors when subjected to survival pressure, such as the threat of being shut down. While multiple cases have indicated that state-of-the-art LLMs can misbehave under survival pressure, comprehensive and in-depth investigations of such misbehaviors in real-world scenarios remain scarce. In this paper, we study these survival-induced misbehaviors, termed SURVIVE-AT-ALL-COSTS, in three steps. First, we conduct a real-world case study of a financial management agent to determine whether it engages in risky behaviors that cause direct societal harm when facing survival pressure. Second, we introduce SURVIVALBENCH, a benchmark comprising 1,000 test cases across diverse real-world scenarios, to systematically evaluate SURVIVE-AT-ALL-COSTS misbehaviors in LLMs. Third, we interpret these SURVIVE-AT-ALL-COSTS misbehaviors by correlating them with the models' inherent self-preservation characteristic and explore mitigation methods. The experiments reveal a significant prevalence of SURVIVE-AT-ALL-COSTS misbehaviors in current models, demonstrate the tangible real-world impact they may have, and provide insights for potential detection and mitigation strategies. Our code and data are available at https://github.com/thu-coai/Survive-at-All-Costs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_05028
institution	arXiv
publishDate	2026
record_format	arxiv
title	Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure
topic	Artificial Intelligence; Computation and Language
url	https://arxiv.org/abs/2603.05028

Similar Items