:: Library Catalog

Bibliographic Details
Main Authors:	Ahn, Yelim; Lee, Jaejin
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Cryptography and Security
Online Access:	https://arxiv.org/abs/2508.01306

Similar Items

Alphabet Index Mapping: Jailbreaking LLMs through Semantic Dissimilarity
by: Husain, Bilal Saleh
Published: (2025)

Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion
by: Cui, Tiehan, et al.
Published: (2025)

FlipAttack: Jailbreak LLMs via Flipping
by: Liu, Yue, et al.
Published: (2024)

Enhancing Jailbreak Attacks on LLMs via Persona Prompts
by: Zhang, Zheng, et al.
Published: (2025)

LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges
by: Li, Haoyang, et al.
Published: (2025)

Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
by: Jaiswal, Piyush, et al.
Published: (2026)

Re-Triggering Safeguards within LLMs for Jailbreak Detection
by: Lin, Zheng, et al.
Published: (2026)

Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses
by: Shang, Zhengchun, et al.
Published: (2025)

PAPILLON: Efficient and Stealthy Fuzz Testing-Powered Jailbreaks for LLMs
by: Gong, Xueluan, et al.
Published: (2024)

Few-Shot Truly Benign DPO Attack for Jailbreaking LLMs
by: Yoon, Sangyeon, et al.
Published: (2026)

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
by: Guo, Weiyang, et al.
Published: (2026)

Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs
by: Pu, Rui, et al.
Published: (2024)

Data to Defense: The Role of Curation in Customizing LLMs Against Jailbreaking Attacks
by: Liu, Xiaoqun, et al.
Published: (2024)

"To Survive, I Must Defect": Jailbreaking LLMs via the Game-Theory Scenarios
by: Sun, Zhen, et al.
Published: (2025)

DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs
by: Xu, Wenzhuo, et al.
Published: (2026)

Emoji-Based Jailbreaking of Large Language Models
by: Gopinadh, M P V S, et al.
Published: (2026)

PPMI: Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases
by: Bae, Yubeen, et al.
Published: (2025)

CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations
by: Li, Xiaohu, et al.
Published: (2025)

Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs
by: Yan, Yu, et al.
Published: (2025)

Bidirectional Intention Inference Enhances LLMs' Defense Against Multi-Turn Jailbreak Attacks
by: Tong, Haibo, et al.
Published: (2025)

SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
by: Wang, Xunguang, et al.
Published: (2024)

Injecting Universal Jailbreak Backdoors into LLMs in Minutes
by: Chen, Zhuowei, et al.
Published: (2025)

Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
by: Xu, Zhao, et al.
Published: (2024)

Automatic Jailbreaking of the Text-to-Image Generative AI Systems
by: Kim, Minseon, et al.
Published: (2024)

One Leak Away: How Pretrained Model Exposure Amplifies Jailbreak Risks in Finetuned LLMs
by: Tan, Yixin, et al.
Published: (2025)

LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper
by: Wu, Daoyuan, et al.
Published: (2024)

Sirens' Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs
by: Ling, Zijian, et al.
Published: (2026)

Untargeted Jailbreak Attack
by: Huang, Xinzhe, et al.
Published: (2025)

Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
by: Liu, Fan, et al.
Published: (2024)

Can LLMs Deeply Detect Complex Malicious Queries? A Framework for Jailbreaking via Obfuscating Intent
by: Shang, Shang, et al.
Published: (2024)

JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
by: Chu, Junjie, et al.
Published: (2024)

Evaluating Jailbreaking Vulnerabilities in LLMs Deployed as Assistants for Smart Grid Operations: A Benchmark Against NERC Standards
by: Hammadia, Taha, et al.
Published: (2026)

JailPO: A Novel Black-box Jailbreak Framework via Preference Optimization against Aligned LLMs
by: Li, Hongyi, et al.
Published: (2024)

Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2026)

RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process
by: Wang, Peiran, et al.
Published: (2024)

Jailbreaking LLMs via Calibration
by: Lu, Yuxuan, et al.
Published: (2026)

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models
by: Park, Junyoung, et al.
Published: (2026)

Graph of Attacks: Improved Black-Box and Interpretable Jailbreaks for LLMs
by: Akbar-Tajari, Mohammad, et al.
Published: (2025)

bi-GRPO: Bidirectional Optimization for Jailbreak Backdoor Injection on LLMs
by: Ji, Wence, et al.
Published: (2025)

SeqAR: Jailbreak LLMs with Sequential Auto-Generated Characters
by: Yang, Yan, et al.
Published: (2024)