:: Library Catalog

Bibliographic Details
Main Authors:	Fan, Yihe; Cao, Yuxin; Zhao, Ziyu; Liu, Ziyao; Li, Shaofeng
Format:	Preprint
Published:	2024
Subjects:	Cryptography and Security; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2404.05264

Similar Items

LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer
by: Cao, Yuxin, et al.
Published: (2023)

Secure and Robust Watermarking for AI-generated Images: A Comprehensive Survey
by: Cao, Jie, et al.
Published: (2025)

Universally Unfiltered and Unseen: Input-Agnostic Multimodal Jailbreaks against Text-to-Image Model Safeguards
by: Yan, Song, et al.
Published: (2025)

VideoSTF: Stress-Testing Output Repetition in Video Large Language Models
by: Cao, Yuxin, et al.
Published: (2026)

A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations
by: Ye, Mang, et al.
Published: (2025)

Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications
by: Trad, Fouad, et al.
Published: (2024)

Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
by: Gao, Kuofeng, et al.
Published: (2024)

Image Corruption-Inspired Membership Inference Attacks against Large Vision-Language Models
by: Wu, Zongyu, et al.
Published: (2025)

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling
by: Li, Zida, et al.
Published: (2026)

Landscape More Secure Than Portrait? Zooming Into the Directionality of Digital Images With Security Implications
by: Lorch, Benedikt, et al.
Published: (2024)

CipherDM: Secure Three-Party Inference for Diffusion Model Sampling
by: Zhao, Xin, et al.
Published: (2024)

Modal Aphasia: Can Unified Multimodal Models Describe Images From Memory?
by: Aerni, Michael, et al.
Published: (2025)

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation
by: Yang, Hao, et al.
Published: (2026)

Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models
by: Zhang, Rui, et al.
Published: (2025)

Provably Secure Robust Image Steganography via Cross-Modal Error Correction
by: Qi, Yuang, et al.
Published: (2024)

Anomaly Unveiled: Securing Image Classification against Adversarial Patch Attacks
by: Chattopadhyay, Nandish, et al.
Published: (2024)

Robust Provably Secure Image Steganography via Latent Iterative Optimization
by: Li, Yanan, et al.
Published: (2026)

VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
by: Yin, Ziyi, et al.
Published: (2023)

Revisiting Data Auditing in Large Vision-Language Models
by: Zhu, Hongyu, et al.
Published: (2025)

Jailbreaking Safeguarded Text-to-Image Models via Large Language Models
by: Jiang, Zhengyuan, et al.
Published: (2025)

Query-Efficient Video Adversarial Attack with Stylized Logo
by: Tang, Duoxun, et al.
Published: (2024)

StyleFool: Fooling Video Classification Systems via Style Transfer
by: Cao, Yuxin, et al.
Published: (2022)

Federated Learning for Large Models in Medical Imaging: A Comprehensive Review
by: Sun, Mengyu, et al.
Published: (2025)

Image-Based Geolocation Using Large Vision-Language Models
by: Liu, Yi, et al.
Published: (2024)

SecureT2I: No More Unauthorized Manipulation on AI Generated Images from Prompts
by: Wu, Xiaodong, et al.
Published: (2025)

Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging
by: Udoy, Md Rahatul Islam, et al.
Published: (2026)

FIDAVL: Fake Image Detection and Attribution using Vision-Language Model
by: Keita, Mamadou, et al.
Published: (2024)

Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization
by: Guan, Jiwei, et al.
Published: (2026)

Test-Time Attention Purification for Backdoored Large Vision Language Models
by: Zhang, Zhifang, et al.
Published: (2026)

Harnessing the Power of Large Vision Language Models for Synthetic Image Detection
by: Keita, Mamadou, et al.
Published: (2024)

$\mathbf{S^2LM}$: Towards Semantic Steganography via Large Language Models
by: Wu, Huanqi, et al.
Published: (2025)

Progressive Feedback-Enhanced Transformer for Image Forgery Localization
by: Zhu, Haochen, et al.
Published: (2023)

SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation
by: Chung, Yunsung, et al.
Published: (2025)

Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)

Security Risk of Misalignment between Text and Image in Multi-modal Model
by: Wang, Xiaosen, et al.
Published: (2025)

Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
by: Hao, Shuyang, et al.
Published: (2025)

DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
by: Zhang, Yitong, et al.
Published: (2025)

Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application
by: Xu, Wenzhuo, et al.
Published: (2025)

Secure Seed-Based Multi-bit Watermarking for Diffusion Models from First Principles
by: Gesny, Enoal, et al.
Published: (2026)

FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models
by: Cui, Yingqian, et al.
Published: (2023)