Bibliographic Details
Main Authors:	Hong, Wenjing, Rong, Zhonghua, Wang, Li, Chang, Feng, Zhu, Jian, Tang, Ke, Zhu, Zexuan, Ong, Yew-Soon
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.20122

Similar Items

Evolving Skill-Structured Attack Memory Enhances LLM Jailbreaking
by: Zhang, Junke, et al.
Published: (2026)

Untargeted Jailbreak Attack
by: Huang, Xinzhe, et al.
Published: (2025)

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models
by: Feng, Yingchaojie, et al.
Published: (2024)

Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses
by: Shang, Zhengchun, et al.
Published: (2025)

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
by: Chen, Yunhao, et al.
Published: (2025)

SoK: Robustness in Large Language Models against Jailbreak Attacks
by: Xu, Feiyue, et al.
Published: (2026)

Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets
by: Lu, Ning, et al.
Published: (2025)

StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Organization Structures
by: Li, Bangxin, et al.
Published: (2024)

VulReaD: Knowledge-Graph-guided Software Vulnerability Reasoning and Detection
by: Mukhtar, Samal, et al.
Published: (2026)

Multi-turn Jailbreaking Attack in Multi-Modal Large Language Models
by: Das, Badhan Chandra, et al.
Published: (2026)

Chain-of-Lure: A Universal Jailbreak Attack Framework using Unconstrained Synthetic Narratives
by: Chang, Wenhan, et al.
Published: (2025)

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking
by: Li, Jindong, et al.
Published: (2026)

Towards Robust Multimodal Large Language Models Against Jailbreak Attacks
by: Yin, Ziyi, et al.
Published: (2025)

JailbreaksOverTime: Detecting Jailbreak Attacks Under Distribution Shift
by: Piet, Julien, et al.
Published: (2025)

Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense
by: Hao, Shuyang, et al.
Published: (2025)

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration
by: Chen, Kejia, et al.
Published: (2026)

TRACE: Task-Aware Adaptive Self-Evolving Agentic Jailbreaking
by: Zeng, Churui, et al.
Published: (2026)

AdvPrefix: An Objective for Nuanced LLM Jailbreaks
by: Zhu, Sicheng, et al.
Published: (2024)

MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques
by: Wang, Jian, et al.
Published: (2024)

PolyJailbreak: Cross-Modal Jailbreaking Attacks on Black-Box Multimodal LLMs
by: Wang, Xinkai, et al.
Published: (2025)

AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens
by: Lu, Lin, et al.
Published: (2024)

JPRO: Automated Multimodal Jailbreaking via Multi-Agent Collaboration Framework
by: Zhou, Yuxuan, et al.
Published: (2025)

From LLMs to MLLMs to Agents: A Survey of Emerging Paradigms in Jailbreak Attacks and Defenses within LLM Ecosystem
by: Mao, Yanxu, et al.
Published: (2025)

Align is not Enough: Multimodal Universal Jailbreak Attack against Multimodal Large Language Models
by: Wang, Youze, et al.
Published: (2025)

MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
by: Deng, Gelei, et al.
Published: (2023)

AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models
by: Reddy, Aashray, et al.
Published: (2025)

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions
by: Qi, Minfeng, et al.
Published: (2026)

Systematic Scaling Analysis of Jailbreak Attacks in Large Language Models
by: Wang, Xiangwen, et al.
Published: (2026)

HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models
by: Narula, Sidhant, et al.
Published: (2025)

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models
by: Park, Junyoung, et al.
Published: (2026)

ASTRA: An Automated Framework for Strategy Discovery, Retrieval, and Evolution for Jailbreaking LLMs
by: Liu, Xu, et al.
Published: (2025)

Multi-Objective Reinforcement Learning for Automated Resilient Cyber Defence
by: O'Driscoll, Ross, et al.
Published: (2024)

Large Language Model Adversarial Landscape Through the Lens of Attack Objectives
by: Wang, Nan, et al.
Published: (2025)

Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)

Enhanced MLLM Black-Box Jailbreaking Attacks and Defenses
by: Zhong, Xingwei, et al.
Published: (2025)

Knowledge-to-Jailbreak: Investigating Knowledge-driven Jailbreaking Attacks for Large Language Models
by: Tu, Shangqing, et al.
Published: (2024)

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning
by: Li, Ziang, et al.
Published: (2025)

Invisible to Humans, Triggered by Agents: Stealthy Jailbreak Attacks on Mobile Vision-Language Agents
by: Ding, Renhua, et al.
Published: (2025)

Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
by: Zhang, Wenhui, et al.
Published: (2025)

Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models
by: Kadali, Sri Durga Sai Sowmya, et al.
Published: (2026)