
Bibliographic Details
Main Authors:	Ye, Xi; Liu, Yiwen; Wang, Lina; Wang, Run; Yang, Geying; Hou, Yufei; Yu, Jiayi
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2601.07141

Similar Items

Metaphor-based Jailbreak Attacks on Text-to-Image Models
by: Zhang, Chenyu, et al.
Published: (2025)

Reason2Attack: Jailbreaking Text-to-Image Models via LLM Reasoning
by: Zhang, Chenyu, et al.
Published: (2025)

One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
by: Li, Linbao, et al.
Published: (2025)

Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models
by: Ma, Jiachen, et al.
Published: (2024)

Towards Effective Prompt Stealing Attack against Text-to-Image Diffusion Models
by: Zhao, Shiqian, et al.
Published: (2025)

Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks
by: Xiong, Chen, et al.
Published: (2024)

AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models
by: Wang, Yiming, et al.
Published: (2024)

Universally Unfiltered and Unseen: Input-Agnostic Multimodal Jailbreaks against Text-to-Image Model Safeguards
by: Yan, Song, et al.
Published: (2025)

Deciphering the Chaos: Enhancing Jailbreak Attacks via Adversarial Prompt Translation
by: Li, Qizhang, et al.
Published: (2024)

Enhancing Jailbreak Attacks on LLMs via Persona Prompts
by: Zhang, Zheng, et al.
Published: (2025)

Defending Jailbreak Prompts via In-Context Adversarial Game
by: Zhou, Yujun, et al.
Published: (2024)

Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
by: Liu, Jiangtao, et al.
Published: (2025)

EVA: Editing for Versatile Alignment against Jailbreaks
by: Wang, Yi, et al.
Published: (2026)

Proactive defense against LLM Jailbreak
by: Zhao, Weiliang, et al.
Published: (2025)

On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
by: Wu, Yixin, et al.
Published: (2023)

Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models
by: Hu, Xiaomeng, et al.
Published: (2024)

Model-Editing-Based Jailbreak against Safety-aligned Large Language Models
by: Li, Yuxi, et al.
Published: (2024)

Acoustic Interference: A New Paradigm Weaponizing Acoustic Latent Semantic for Universal Jailbreak against Large Audio Language Models
by: Wang, Yanyun, et al.
Published: (2026)

HTS-Attack: Heuristic Token Search for Jailbreaking Text-to-Image Models
by: Gao, Sensen, et al.
Published: (2024)

ShallowJail: Steering Jailbreaks against Large Language Models
by: Liu, Shang, et al.
Published: (2026)

Combinational Backdoor Attack against Customized Text-to-Image Models
by: Jiang, Wenbo, et al.
Published: (2024)

GeneBreaker: Jailbreak Attacks against DNA Language Models with Pathogenicity Guidance
by: Zhang, Zaixi, et al.
Published: (2025)

SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
by: Wang, Xunguang, et al.
Published: (2024)

Align is not Enough: Multimodal Universal Jailbreak Attack against Multimodal Large Language Models
by: Wang, Youze, et al.
Published: (2025)

SoK: Robustness in Large Language Models against Jailbreak Attacks
by: Xu, Feiyue, et al.
Published: (2026)

Imperceptible Jailbreaking against Large Language Models
by: Gao, Kuofeng, et al.
Published: (2025)

ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks
by: Zhuang, Zhixiong, et al.
Published: (2025)

Let the Bees Find the Weak Spots: A Path Planning Perspective on Multi-Turn Jailbreak Attacks against LLMs
by: Liu, Yize, et al.
Published: (2025)

Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models
by: Yu, Zhiyuan, et al.
Published: (2024)

PLA: Prompt Learning Attack against Text-to-Image Generative Models
by: Lyu, Xinqi, et al.
Published: (2025)

Towards Action Hijacking of Large Language Model-based Agent
by: Zhang, Yuyang, et al.
Published: (2024)

OrchJail: Jailbreaking Tool-Calling Text-to-Image Agents by Orchestration-Guided Fuzzing
by: Chen, Jianming, et al.
Published: (2026)

Knowledge-to-Jailbreak: Investigating Knowledge-driven Jailbreaking Attacks for Large Language Models
by: Tu, Shangqing, et al.
Published: (2024)

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
by: Chen, Yunhao, et al.
Published: (2025)

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2025)

Automatic Jailbreaking of the Text-to-Image Generative AI Systems
by: Kim, Minseon, et al.
Published: (2024)

MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
by: Deng, Gelei, et al.
Published: (2023)

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
by: Chen, Yulin, et al.
Published: (2025)

Transfer Learning of Real Image Features with Soft Contrastive Loss for Fake Image Detection
by: Liang, Ziyou, et al.
Published: (2024)

$PC^2$: Politically Controversial Content Generation via Jailbreaking Attacks on GPT-based Text-to-Image Models
by: Choi, Wonwoo, et al.
Published: (2026)