| Main authors: | Liang, Zhen; Huang, Hai; Chen, Zhengkui |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2512.23173 |
Similar documents
PiCo: Jailbreaking Multimodal Large Language Models via Pictorial Code Contextualization
by: Liu, Aofan, et al.
Published: (2025)
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
by: Lv, Huijie, et al.
Published: (2024)
Breaking the Code: Security Assessment of AI Code Agents Through Systematic Jailbreaking Attacks
by: Saha, Shoumik, et al.
Published: (2025)
Multi-turn Jailbreaking Attack in Multi-Modal Large Language Models
by: Das, Badhan Chandra, et al.
Published: (2026)
Emoji-Based Jailbreaking of Large Language Models
by: Gopinadh, M P V S, et al.
Published: (2026)
CodeBC: A More Secure Large Language Model for Smart Contract Code Generation in Blockchain
by: Wang, Lingxiang, et al.
Published: (2025)
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
by: Wang, Yi, et al.
Published: (2025)
A Cross-Language Investigation into Jailbreak Attacks in Large Language Models
by: Li, Jie, et al.
Published: (2024)
SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories
by: Shen, Chihao, et al.
Published: (2025)
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
by: Zhao, Shiji, et al.
Published: (2025)
HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models
by: Narula, Sidhant, et al.
Published: (2025)
Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models
by: Hong, Wenjing, et al.
Published: (2026)
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
by: Ren, Qibing, et al.
Published: (2024)
Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
by: Ying, Zonghao, et al.
Published: (2025)
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
by: Liu, Tong, et al.
Published: (2024)
Prefill-level Jailbreak: A Black-Box Risk Analysis of Large Language Models
by: Li, Yakai, et al.
Published: (2025)
Sequential Comics for Jailbreaking Multimodal Large Language Models via Structured Visual Storytelling
by: Zhang, Deyue, et al.
Published: (2025)
Behind the Mask: Benchmarking Camouflaged Jailbreaks in Large Language Models
by: Zheng, Youjia, et al.
Published: (2025)
ShallowJail: Steering Jailbreaks against Large Language Models
by: Liu, Shang, et al.
Published: (2026)
SoK: Evaluating Jailbreak Guardrails for Large Language Models
by: Wang, Xunguang, et al.
Published: (2025)
Distract Large Language Models for Automatic Jailbreak Attack
by: Xiao, Zeguan, et al.
Published: (2024)
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
by: Ma, Siyuan, et al.
Published: (2024)
Jailbreaking and Mitigation of Vulnerabilities in Large Language Models
by: Peng, Benji, et al.
Published: (2024)
Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models
by: Teng, Ma, et al.
Published: (2024)
Depth Charge: Jailbreak Large Language Models from Deep Safety Attention Heads
by: Wu, Jinman, et al.
Published: (2026)
Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2026)
A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models
by: Xu, Zihao, et al.
Published: (2024)
Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models
by: Xiong, Chen, et al.
Published: (2026)
NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models
by: Zhang, Chuhan, et al.
Published: (2025)
The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models
by: Wu, Zihui, et al.
Published: (2024)
SoK: Robustness in Large Language Models against Jailbreak Attacks
by: Xu, Feiyue, et al.
Published: (2026)
RobustKV: Defending Large Language Models against Jailbreak Attacks via KV Eviction
by: Jiang, Tanqiu, et al.
Published: (2024)
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models
by: Liu, Yanjiang, et al.
Published: (2025)
Breaking Minds, Breaking Systems: Jailbreaking Large Language Models via Human-like Psychological Manipulation
by: Liu, Zehao, et al.
Published: (2025)
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders
by: Noever, David, et al.
Published: (2024)
LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution
by: Pohlenz, Christopher G. Pedraza, et al.
Published: (2026)
Assessing Spear-Phishing Website Generation in Large Language Model Coding Agents
by: Malloy, Tailia, et al.
Published: (2026)
Traces of Memorisation in Large Language Models for Code
by: Al-Kaswan, Ali, et al.
Published: (2023)
Is the System Message Really Important to Jailbreaks in Large Language Models?
by: Zou, Xiaotian, et al.
Published: (2024)
Knowledge-to-Jailbreak: Investigating Knowledge-driven Jailbreaking Attacks for Large Language Models
by: Tu, Shangqing, et al.
Published: (2024)