| Main Authors: | Fan, Yihe, Cao, Yuxin, Zhao, Ziyu, Liu, Ziyao, Li, Shaofeng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.05264 |
Similar Items
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer
by: Cao, Yuxin, et al.
Published: (2023)
Secure and Robust Watermarking for AI-generated Images: A Comprehensive Survey
by: Cao, Jie, et al.
Published: (2025)
Universally Unfiltered and Unseen: Input-Agnostic Multimodal Jailbreaks against Text-to-Image Model Safeguards
by: Yan, Song, et al.
Published: (2025)
VideoSTF: Stress-Testing Output Repetition in Video Large Language Models
by: Cao, Yuxin, et al.
Published: (2026)
A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations
by: Ye, Mang, et al.
Published: (2025)
Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications
by: Trad, Fouad, et al.
Published: (2024)
Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
by: Gao, Kuofeng, et al.
Published: (2024)
Image Corruption-Inspired Membership Inference Attacks against Large Vision-Language Models
by: Wu, Zongyu, et al.
Published: (2025)
Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling
by: Li, Zida, et al.
Published: (2026)
Landscape More Secure Than Portrait? Zooming Into the Directionality of Digital Images With Security Implications
by: Lorch, Benedikt, et al.
Published: (2024)
CipherDM: Secure Three-Party Inference for Diffusion Model Sampling
by: Zhao, Xin, et al.
Published: (2024)
Modal Aphasia: Can Unified Multimodal Models Describe Images From Memory?
by: Aerni, Michael, et al.
Published: (2025)
A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation
by: Yang, Hao, et al.
Published: (2026)
Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models
by: Zhang, Rui, et al.
Published: (2025)
Provably Secure Robust Image Steganography via Cross-Modal Error Correction
by: Qi, Yuang, et al.
Published: (2024)
Anomaly Unveiled: Securing Image Classification against Adversarial Patch Attacks
by: Chattopadhyay, Nandish, et al.
Published: (2024)
Robust Provably Secure Image Steganography via Latent Iterative Optimization
by: Li, Yanan, et al.
Published: (2026)
VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
by: Yin, Ziyi, et al.
Published: (2023)
Revisiting Data Auditing in Large Vision-Language Models
by: Zhu, Hongyu, et al.
Published: (2025)
Jailbreaking Safeguarded Text-to-Image Models via Large Language Models
by: Jiang, Zhengyuan, et al.
Published: (2025)
Query-Efficient Video Adversarial Attack with Stylized Logo
by: Tang, Duoxun, et al.
Published: (2024)
StyleFool: Fooling Video Classification Systems via Style Transfer
by: Cao, Yuxin, et al.
Published: (2022)
Federated Learning for Large Models in Medical Imaging: A Comprehensive Review
by: Sun, Mengyu, et al.
Published: (2025)
Image-Based Geolocation Using Large Vision-Language Models
by: Liu, Yi, et al.
Published: (2024)
SecureT2I: No More Unauthorized Manipulation on AI Generated Images from Prompts
by: Wu, Xiaodong, et al.
Published: (2025)
Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging
by: Udoy, Md Rahatul Islam, et al.
Published: (2026)
FIDAVL: Fake Image Detection and Attribution using Vision-Language Model
by: Keita, Mamadou, et al.
Published: (2024)
Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization
by: Guan, Jiwei, et al.
Published: (2026)
Test-Time Attention Purification for Backdoored Large Vision Language Models
by: Zhang, Zhifang, et al.
Published: (2026)
Harnessing the Power of Large Vision Language Models for Synthetic Image Detection
by: Keita, Mamadou, et al.
Published: (2024)
$\mathbf{S^2LM}$: Towards Semantic Steganography via Large Language Models
by: Wu, Huanqi, et al.
Published: (2025)
Progressive Feedback-Enhanced Transformer for Image Forgery Localization
by: Zhu, Haochen, et al.
Published: (2023)
SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation
by: Chung, Yunsung, et al.
Published: (2025)
Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)
Security Risk of Misalignment between Text and Image in Multi-modal Model
by: Wang, Xiaosen, et al.
Published: (2025)
Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
by: Hao, Shuyang, et al.
Published: (2025)
DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
by: Zhang, Yitong, et al.
Published: (2025)
Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application
by: Xu, Wenzhuo, et al.
Published: (2025)
Secure Seed-Based Multi-bit Watermarking for Diffusion Models from First Principles
by: Gesny, Enoal, et al.
Published: (2026)
FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models
by: Cui, Yingqian, et al.
Published: (2023)