| Main Authors: | Ying, Zonghao, Dai, Haowen, Hu, Lianyu, Jing, Zonglei, Zou, Quanchen, Yang, Yaodong, Liu, Aishan, Liu, Xianglong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.05853 |
Similar Items
SPARK: Jailbreaking T2V Models by Synergistically Prompting Auditory and Recontextualized Knowledge
by: Ying, Zonghao, et al.
Published: (2025)
CogMorph: Cognitive Morphing Attacks for Text-to-Image Models
by: Jing, Zonglei, et al.
Published: (2025)
Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
by: Ying, Zonghao, et al.
Published: (2025)
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks
by: Ying, Zonghao, et al.
Published: (2024)
Evolving Deception: When Agents Evolve, Deception Wins
by: Ying, Zonghao, et al.
Published: (2026)
RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic
by: Wang, Le, et al.
Published: (2025)
PromptSafe: Gated Prompt Tuning for Safe Text-to-Image Generation
by: Jing, Zonglei, et al.
Published: (2025)
AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization
by: Ying, Zonghao, et al.
Published: (2026)
TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks
by: Zou, Quanchen, et al.
Published: (2026)
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt
by: Ying, Zonghao, et al.
Published: (2024)
Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
by: Mu, Junjie, et al.
Published: (2025)
PRJ: Perception-Retrieval-Judgement for Generated Images
by: Fu, Qiang, et al.
Published: (2025)
Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application
by: Xu, Wenzhuo, et al.
Published: (2025)
PRISM: Programmatic Reasoning with Image Sequence Manipulation for LVLM Jailbreaking
by: Zou, Quanchen, et al.
Published: (2025)
SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents
by: Ying, Zonghao, et al.
Published: (2025)
Two Frames Matter: A Temporal Attack for Text-to-Video Model Jailbreaking
by: Chen, Moyang, et al.
Published: (2026)
Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings
by: Ying, Zonghao, et al.
Published: (2025)
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
by: Wang, Le, et al.
Published: (2025)
Adversarial Generation and Collaborative Evolution of Safety-Critical Scenarios for Autonomous Vehicles
by: Liu, Jiangfan, et al.
Published: (2025)
DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs
by: Xu, Wenzhuo, et al.
Published: (2026)
Towards Robust Physical-world Backdoor Attacks on Lane Detection
by: Zhang, Xinwei, et al.
Published: (2024)
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
by: Liu, Jiayang, et al.
Published: (2025)
Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models
by: Balakrishnan, Ravikumar, et al.
Published: (2026)
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
by: Xiao, Yisong, et al.
Published: (2025)
Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving
by: Wang, Lu, et al.
Published: (2025)
Metaphor-based Jailbreak Attacks on Text-to-Image Models
by: Zhang, Chenyu, et al.
Published: (2025)
HTS-Attack: Heuristic Token Search for Jailbreaking Text-to-Image Models
by: Gao, Sensen, et al.
Published: (2024)
Reasoning-Oriented Programming: Chaining Semantic Gadgets to Jailbreak Large Vision Language Models
by: Zou, Quanchen, et al.
Published: (2026)
Visual Adversarial Attack on Vision-Language Models for Autonomous Driving
by: Zhang, Tianyuan, et al.
Published: (2024)
DIVER: Dynamic Iterative Visual Evidence Reasoning for Multimodal Fake News Detection
by: Zhou, Weilin, et al.
Published: (2026)
Uncovering Strategic Egoism Behaviors in Large Language Models
by: Zhang, Yaoyuan, et al.
Published: (2025)
Multi-Turn Context Jailbreak Attack on Large Language Models From First Principles
by: Sun, Xiongtao, et al.
Published: (2024)
Reason2Attack: Jailbreaking Text-to-Image Models via LLM Reasoning
by: Zhang, Chenyu, et al.
Published: (2025)
GuardAD: Safeguarding Autonomous Driving MLLMs via Markovian Safety Logic
by: Zhang, Tianyuan, et al.
Published: (2026)
Improving Continuous Sign Language Recognition with Adapted Image Models
by: Hu, Lianyu, et al.
Published: (2024)
Perception-guided Jailbreak against Text-to-Image Models
by: Huang, Yihao, et al.
Published: (2024)
Low-Effort Jailbreak Attacks Against Text-to-Image Safety Filters
by: Mustafa, Ahmed B, et al.
Published: (2026)
Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction
by: Hu, Jin, et al.
Published: (2025)
SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models
by: Ying, Zonghao, et al.
Published: (2024)
Bench2ADVLM: A Closed-Loop Benchmark for Vision-language Models in Autonomous Driving
by: Zhang, Tianyuan, et al.
Published: (2025)