:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yugeng; Li, Zheng; Huang, Hai; Backes, Michael; Zhang, Yang
Format:	Preprint
Published:	2025
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2506.18870

Similar Items

Watermarking LLM-Generated Datasets in Downstream Tasks
by: Liu, Yugeng, et al.
Published: (2025)

$\texttt{ModSCAN}$: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
by: Jiang, Yukun, et al.
Published: (2024)

JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
by: Chu, Junjie, et al.
Published: (2024)

Robustness Over Time: Understanding Adversarial Examples' Effectiveness on Longitudinal Versions of Large Language Models
by: Liu, Yugeng, et al.
Published: (2023)

Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)

Membership Inference Attacks Against In-Context Learning
by: Wen, Rui, et al.
Published: (2024)

Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?
by: Wen, Rui, et al.
Published: (2024)

SoK: Data Reconstruction Attacks Against Machine Learning Models: Definition, Metrics, and Benchmark
by: Wen, Rui, et al.
Published: (2025)

Transferable Availability Poisoning Attacks
by: Liu, Yiyong, et al.
Published: (2023)

Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models
by: Chen, Zeyuan, et al.
Published: (2026)

Excessive Reasoning Attack on Reasoning LLMs
by: Si, Wai Man, et al.
Published: (2025)

Vera Verto: Multimodal Hijacking Attack
by: Zhang, Minxing, et al.
Published: (2024)

Voice Jailbreak Attacks Against GPT-4o
by: Shen, Xinyue, et al.
Published: (2024)

Prompt Stealing Attacks Against Text-to-Image Generation Models
by: Shen, Xinyue, et al.
Published: (2023)

AttackPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents
by: Wu, Yixin, et al.
Published: (2025)

Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills
by: Liu, Yiyong, et al.
Published: (2026)

Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks
by: Zhang, Minxing, et al.
Published: (2024)

SOS! Soft Prompt Attack Against Open-Source Large Language Models
by: Yang, Ziqing, et al.
Published: (2024)

Efficient Data-Free Model Stealing with Label Diversity
by: Liu, Yiyong, et al.
Published: (2024)

Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency
by: Jiang, Yukun, et al.
Published: (2025)

Secure Composition of Robust and Optimising Compilers
by: Kruse, Matthis, et al.
Published: (2023)

Instruction Backdoor Attacks Against Customized LLMs
by: Zhang, Rui, et al.
Published: (2024)

BadBone: Backdoor Attacks Against Backbone Models in Visual Prompt Learning
by: Yang, Ziqing, et al.
Published: (2026)

Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs
by: Jiang, Yukun, et al.
Published: (2026)

MGTBench: Benchmarking Machine-Generated Text Detection
by: He, Xinlei, et al.
Published: (2023)

ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization
by: Si, Wai Man, et al.
Published: (2024)

Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities
by: Qu, Yiting, et al.
Published: (2025)

Link Stealing Attacks Against Inductive Graph Neural Networks
by: Wu, Yixin, et al.
Published: (2024)

Peering Behind the Shield: Guardrail Identification in Large Language Models
by: Yang, Ziqing, et al.
Published: (2025)

Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
by: Zhang, Boyang, et al.
Published: (2024)

On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
by: Wu, Yixin, et al.
Published: (2023)

CAMH: Advancing Model Hijacking Attack in Machine Learning
by: He, Xing, et al.
Published: (2024)

Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach
by: Zheng, Weidong, et al.
Published: (2026)

When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
by: Shen, Xinyue, et al.
Published: (2025)

Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution
by: Wu, Yixin, et al.
Published: (2024)

Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate
by: Qi, Senmao, et al.
Published: (2025)

Detecting Quishing Attacks with Machine Learning Techniques Through QR Code Analysis
by: Trad, Fouad, et al.
Published: (2025)

How Secure is Forgetting? Linking Machine Unlearning to Machine Learning Attacks
by: P., Muhammed Shafi K., et al.
Published: (2025)

A Comprehensive Review of Adversarial Attacks on Machine Learning
by: Ahmed, Syed Quiser, et al.
Published: (2024)

HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?
by: Jiang, Yukun, et al.
Published: (2026)