| Main Authors: | Zhao, Wenzheng, Gadiputi, Madhava Kalyan, Yuan, Fengpei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.03264 |
Similar Items
Wukong Framework for Not Safe For Work Detection in Text-to-Image systems
by: Liu, Mingrui, et al.
Published: (2025)
SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models
by: Qi, Peigui, et al.
Published: (2025)
Beyond Vulnerabilities: A Survey of Adversarial Attacks as Both Threats and Defenses in Computer Vision Systems
by: Guo, Zhongliang, et al.
Published: (2025)
GuardTrace-VL: Detecting Unsafe Multimodel Reasoning via Iterative Safety Supervision
by: Xiang, Yuxiao, et al.
Published: (2025)
SteerDiff: Steering towards Safe Text-to-Image Diffusion Models
by: Zhang, Hongxiang, et al.
Published: (2024)
SafeVision: Efficient Image Guardrail with Robust Policy Adherence and Explainability
by: Xu, Peiyang, et al.
Published: (2025)
The Structural Safety Generalization Problem
by: Broomfield, Julius, et al.
Published: (2025)
Beyond the Safety Tax: Mitigating Unsafe Text-to-Image Generation via External Safety Rectification
by: Meng, Xiangtao, et al.
Published: (2025)
X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks
by: Cheng, Zihang, et al.
Published: (2025)
CPR: Retrieval Augmented Generation for Copyright Protection
by: Golatkar, Aditya, et al.
Published: (2024)
An Experimental Study of Trojan Vulnerabilities in UAV Autonomous Landing
by: Ahmari, Reza, et al.
Published: (2025)
VideoEraser: Concept Erasure in Text-to-Video Diffusion Models
by: Xu, Naen, et al.
Published: (2025)
HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios
by: Pu, Jiayue, et al.
Published: (2026)
Rethinking and Red-Teaming Protective Perturbation in Personalized Diffusion Models
by: Liu, Yixin, et al.
Published: (2024)
IdentityGuard: Context-Aware Restriction and Provenance for Personalized Synthesis
by: Zhang, Lingyun, et al.
Published: (2026)
Watertox: The Art of Simplicity in Universal Attacks A Cross-Model Framework for Robust Adversarial Generation
by: Gao, Zhenghao, et al.
Published: (2024)
Value-Aligned Prompt Moderation via Zero-Shot Agentic Rewriting for Safe Image Generation
by: Zhao, Xin, et al.
Published: (2025)
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models
by: Gao, Kuofeng, et al.
Published: (2025)
VLM-Guard: Safeguarding Vision-Language Models via Fulfilling Safety Alignment Gap
by: Liu, Qin, et al.
Published: (2025)
PII-VisBench: Evaluating Personally Identifiable Information Safety in Vision Language Models Along a Continuum of Visibility
by: Shahariar, G M, et al.
Published: (2026)
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm
by: Leng, Ye, et al.
Published: (2026)
Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling
by: Hu, Zhanhao, et al.
Published: (2023)
T2I-RiskyPrompt: A Benchmark for Safety Evaluation, Attack, and Defense on Text-to-Image Model
by: Zhang, Chenyu, et al.
Published: (2025)
Vulnerability analysis of captcha using Deep learning
by: Walia, Jaskaran Singh, et al.
Published: (2023)
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses
by: Li, Xiao, et al.
Published: (2026)
Combating Falsification of Speech Videos with Live Optical Signatures (Extended Version)
by: Schwartz, Hadleigh, et al.
Published: (2025)
GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response
by: Mittal, Govind, et al.
Published: (2022)
Unveiling the Potential: Harnessing Deep Metric Learning to Circumvent Video Streaming Encryption
by: Gansekoele, Arwin, et al.
Published: (2024)
Shaking the Fake: Detecting Deepfake Videos in Real Time via Active Probes
by: Xie, Zhixin, et al.
Published: (2024)
Architectural Neural Backdoors from First Principles
by: Langford, Harry, et al.
Published: (2024)
Representation Magnitude has a Liability to Privacy Vulnerability
by: Fang, Xingli, et al.
Published: (2024)
Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety
by: Ma, Xingjun, et al.
Published: (2025)
Improving Adversarial Training using Vulnerability-Aware Perturbation Budget
by: Fakorede, Olukorede, et al.
Published: (2024)
Refusing Safe Prompts for Multi-modal Large Language Models
by: Shao, Zedian, et al.
Published: (2024)
SKeDA: A Generative Watermarking Framework for Text-to-video Diffusion Models
by: Yang, Yang, et al.
Published: (2026)
Spot Risks Before Speaking! Unraveling Safety Attention Heads in Large Vision-Language Models
by: Zheng, Ziwei, et al.
Published: (2025)
FedPalm: A General Federated Learning Framework for Closed- and Open-Set Palmprint Verification
by: Yang, Ziyuan, et al.
Published: (2025)
CipherDM: Secure Three-Party Inference for Diffusion Model Sampling
by: Zhao, Xin, et al.
Published: (2024)
REFORGE: Multi-modal Attacks Reveal Vulnerable Concept Unlearning in Image Generation Models
by: Zou, Yong, et al.
Published: (2026)
SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models
by: Li, Xinfeng, et al.
Published: (2024)