| Main Authors: | Lu, Liming, Gu, Xiang, Pang, Shuchao, Liang, Siyuan, Zhu, Haotian, Zeng, Xiyu, Zheng, Xu, Zhou, Yongbin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.04833 |
Similar Items
SafeSteer: Adaptive Subspace Steering for Efficient Jailbreak Defense in Vision-Language Models
by: Zeng, Xiyu, et al.
Published: (2025)
Multimodal Robust Prompt Distillation for 3D Point Cloud Models
by: Gu, Xiang, et al.
Published: (2025)
FERD: Fairness-Enhanced Data-Free Robustness Distillation
by: Li, Zhengxiao, et al.
Published: (2025)
RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation
by: Liu, Enguang, et al.
Published: (2025)
CIARD: Cyclic Iterative Adversarial Robustness Distillation
by: Lu, Liming, et al.
Published: (2025)
Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance
by: Pang, Shuchao, et al.
Published: (2025)
DREAM: Dynamic Red-teaming across Environments for AI Models
by: Lu, Liming, et al.
Published: (2025)
Test-Time Immunization: A Universal Defense Framework Against Jailbreaks for (Multimodal) Large Language Models
by: Yu, Yongcan, et al.
Published: (2025)
Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge
by: Lu, Weikai, et al.
Published: (2024)
Defense-to-Attack: Bypassing Weak Defenses Enables Stronger Jailbreaks in Vision-Language Models
by: Zhao, Yunhan, et al.
Published: (2025)
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
by: Liu, Xuannan, et al.
Published: (2024)
Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
by: Wang, Siyuan, et al.
Published: (2024)
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
by: Jia, Xiaojun, et al.
Published: (2024)
Jailbreaking Multimodal Large Language Models using Multi-Clip Video
by: Kang, Choongwon, et al.
Published: (2026)
White-box Multimodal Jailbreaks Against Large Vision-Language Models
by: Wang, Ruofan, et al.
Published: (2024)
Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)
Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
by: Li, Yifan, et al.
Published: (2024)
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
by: Gu, Xiangming, et al.
Published: (2024)
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses
by: Zheng, Xiaosen, et al.
Published: (2024)
Weak-to-Strong Jailbreaking on Large Language Models
by: Zhao, Xuandong, et al.
Published: (2024)
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
by: Zhao, Shiji, et al.
Published: (2025)
Unified Defense for Large Language Models against Jailbreak and Fine-Tuning Attacks in Education
by: Yi, Xin, et al.
Published: (2025)
Enhancing Content-based Recommendation via Large Language Model
by: Xu, Wentao, et al.
Published: (2024)
Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation
by: Liang, Yupu, et al.
Published: (2025)
Res-Bench: Benchmarking the Robustness of Multimodal Large Language Models to Dynamic Resolution Input
by: Li, Chenxu, et al.
Published: (2025)
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models
by: Cheng, Xianfu, et al.
Published: (2025)
Distraction is All You Need for Multimodal Large Language Model Jailbreaking
by: Yang, Zuopeng, et al.
Published: (2025)
GAMBIT: A Gamified Jailbreak Framework for Multimodal Large Language Models
by: Hu, Xiangdong, et al.
Published: (2026)
Reconstruction of Differentially Private Text Sanitization via Large Language Models
by: Pang, Shuchao, et al.
Published: (2024)
Cross-modality Information Check for Detecting Jailbreaking in Multimodal Large Language Models
by: Xu, Yue, et al.
Published: (2024)
Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models
by: Huang, Xijie, et al.
Published: (2024)
OFFSIDE: Benchmarking Unlearning Misinformation in Multimodal Large Language Models
by: Zheng, Hao, et al.
Published: (2025)
OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
by: Jia, Xiaojun, et al.
Published: (2025)
Test-Time Backdoor Attacks on Multimodal Large Language Models
by: Lu, Dong, et al.
Published: (2024)
AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents
by: Luo, Yutao, et al.
Published: (2026)
Towards Visual Text Grounding of Multimodal Large Language Model
by: Li, Ming, et al.
Published: (2025)
UniCode: Learning a Unified Codebook for Multimodal Large Language Models
by: Zheng, Sipeng, et al.
Published: (2024)
Vision-Centric Activation and Coordination for Multimodal Large Language Models
by: Wang, Yunnan, et al.
Published: (2025)
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
by: Wang, Weiyun, et al.
Published: (2024)