:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yunyi; Cui, Shibo; Liu, Baojun; Yu, Jingkai; Zhang, Min; Shi, Fan; Zheng, Han
Format:	Preprint
Published:	2025
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2511.17874

Similar Items

Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries
by: Hung, Ki Sen, et al.
Published: (2026)

You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense
by: Mai, Wuyuao, et al.
Published: (2025)

Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation
by: Chen, Pei, et al.
Published: (2026)

Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking
by: Wu, Yu-Hang, et al.
Published: (2025)

NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models
by: Zhang, Chuhan, et al.
Published: (2025)

Sparse Autoencoders are Capable LLM Jailbreak Mitigators
by: Assogba, Yannick, et al.
Published: (2026)

Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs
by: Xiang, Shiyu, et al.
Published: (2025)

Lurking in the shadows: Unveiling Stealthy Backdoor Attacks against Personalized Federated Learning
by: Lyu, Xiaoting, et al.
Published: (2024)

Beyond Jailbreaks: Revealing Stealthier and Broader LLM Security Risks Stemming from Alignment Failures
by: Zhou, Yukai, et al.
Published: (2025)

Beyond Fixed and Dynamic Prompts: Embedded Jailbreak Templates for Advancing LLM Security
by: Kim, Hajun, et al.
Published: (2025)

JailbreakLens: Interpreting Jailbreak Mechanism in the Lens of Representation and Circuit
by: He, Zeqing, et al.
Published: (2024)

From LLMs to MLLMs to Agents: A Survey of Emerging Paradigms in Jailbreak Attacks and Defenses within LLM Ecosystem
by: Mao, Yanxu, et al.
Published: (2025)

The Art of the Jailbreak: Formulating Jailbreak Attacks for LLM Security Beyond Binary Scoring
by: Hossain, Ismail, et al.
Published: (2026)

Evolving Skill-Structured Attack Memory Enhances LLM Jailbreaking
by: Zhang, Junke, et al.
Published: (2026)

AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens
by: Lu, Lin, et al.
Published: (2024)

Beyond Jailbreaking: Auditing Contextual Privacy in LLM Agents
by: Das, Saswat, et al.
Published: (2025)

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
by: Guo, Weiyang, et al.
Published: (2026)

AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
by: Zhao, Weixiang, et al.
Published: (2025)

SpatialJB: How Text Distribution Art Becomes the "Jailbreak Key" for LLM Guardrails
by: Mou, Zhiyi, et al.
Published: (2026)

Proactive defense against LLM Jailbreak
by: Zhao, Weiliang, et al.
Published: (2025)

When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
by: Chen, Xuan, et al.
Published: (2024)

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2025)

Unveiling Privacy Risks in LLM Agent Memory
by: Wang, Bo, et al.
Published: (2025)

Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation
by: Zhang, Jiankun, et al.
Published: (2025)

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges
by: Li, Yu, et al.
Published: (2025)

ASTRA: An Automated Framework for Strategy Discovery, Retrieval, and Evolution for Jailbreaking LLMs
by: Liu, Xu, et al.
Published: (2025)

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments
by: Zhang, Chiyu, et al.
Published: (2026)

Bleeding Pathways: Vanishing Discriminability in LLM Hidden States Fuels Jailbreak Attacks
by: Zhang, Yingjie, et al.
Published: (2025)

Beyond Model Jailbreak: Systematic Dissection of the "Ten Deadly Sins" in Embodied Intelligence
by: Huang, Yuhang, et al.
Published: (2025)

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents
by: Chen, Jiahao, et al.
Published: (2026)

Pandora: Jailbreak GPTs by Retrieval Augmented Generation Poisoning
by: Deng, Gelei, et al.
Published: (2024)

SQL Injection Jailbreak: A Structural Disaster of Large Language Models
by: Zhao, Jiawei, et al.
Published: (2024)

TPM2.0-Supported Runtime Customizable TEE on FPGA-SoC with User-Controllable vTPM
by: Mao, Jingkai, et al.
Published: (2025)

PDRIMA: A Policy-Driven Runtime Integrity Measurement and Attestation Approach for ARM TrustZone-based TEE
by: Mao, Jingkai, et al.
Published: (2025)

How Real is Your Jailbreak? Fine-grained Jailbreak Evaluation with Anchored Reference
by: Liu, Songyang, et al.
Published: (2026)

FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models
by: Yao, Dongyu, et al.
Published: (2023)

LLM-Virus: Evolutionary Jailbreak Attack on Large Language Models
by: Yu, Miao, et al.
Published: (2024)

The Human-Machine Identity Blur: A Unified Framework for Cybersecurity Risk Management in 2025
by: Janani, Kush
Published: (2025)

Geneshift: Impact of different scenario shift on Jailbreaking LLM
by: Wu, Tianyi, et al.
Published: (2025)

Enhancing Jailbreak Attacks on LLMs via Persona Prompts
by: Zhang, Zheng, et al.
Published: (2025)