| Main Authors: | Liu, Yugeng, Li, Zheng, Huang, Hai, Backes, Michael, Zhang, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.18870 |
Similar Items
Watermarking LLM-Generated Datasets in Downstream Tasks
by: Liu, Yugeng, et al.
Published: (2025)
$\texttt{ModSCAN}$: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
by: Jiang, Yukun, et al.
Published: (2024)
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
by: Chu, Junjie, et al.
Published: (2024)
Robustness Over Time: Understanding Adversarial Examples' Effectiveness on Longitudinal Versions of Large Language Models
by: Liu, Yugeng, et al.
Published: (2023)
Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)
Membership Inference Attacks Against In-Context Learning
by: Wen, Rui, et al.
Published: (2024)
Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?
by: Wen, Rui, et al.
Published: (2024)
SoK: Data Reconstruction Attacks Against Machine Learning Models: Definition, Metrics, and Benchmark
by: Wen, Rui, et al.
Published: (2025)
Transferable Availability Poisoning Attacks
by: Liu, Yiyong, et al.
Published: (2023)
Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models
by: Chen, Zeyuan, et al.
Published: (2026)
Excessive Reasoning Attack on Reasoning LLMs
by: Si, Wai Man, et al.
Published: (2025)
Vera Verto: Multimodal Hijacking Attack
by: Zhang, Minxing, et al.
Published: (2024)
Voice Jailbreak Attacks Against GPT-4o
by: Shen, Xinyue, et al.
Published: (2024)
Prompt Stealing Attacks Against Text-to-Image Generation Models
by: Shen, Xinyue, et al.
Published: (2023)
AttackPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents
by: Wu, Yixin, et al.
Published: (2025)
Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills
by: Liu, Yiyong, et al.
Published: (2026)
Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks
by: Zhang, Minxing, et al.
Published: (2024)
SOS! Soft Prompt Attack Against Open-Source Large Language Models
by: Yang, Ziqing, et al.
Published: (2024)
Efficient Data-Free Model Stealing with Label Diversity
by: Liu, Yiyong, et al.
Published: (2024)
Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency
by: Jiang, Yukun, et al.
Published: (2025)
Secure Composition of Robust and Optimising Compilers
by: Kruse, Matthis, et al.
Published: (2023)
Instruction Backdoor Attacks Against Customized LLMs
by: Zhang, Rui, et al.
Published: (2024)
BadBone: Backdoor Attacks Against Backbone Models in Visual Prompt Learning
by: Yang, Ziqing, et al.
Published: (2026)
Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs
by: Jiang, Yukun, et al.
Published: (2026)
MGTBench: Benchmarking Machine-Generated Text Detection
by: He, Xinlei, et al.
Published: (2023)
ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization
by: Si, Wai Man, et al.
Published: (2024)
Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities
by: Qu, Yiting, et al.
Published: (2025)
Link Stealing Attacks Against Inductive Graph Neural Networks
by: Wu, Yixin, et al.
Published: (2024)
Peering Behind the Shield: Guardrail Identification in Large Language Models
by: Yang, Ziqing, et al.
Published: (2025)
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
by: Zhang, Boyang, et al.
Published: (2024)
On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
by: Wu, Yixin, et al.
Published: (2023)
CAMH: Advancing Model Hijacking Attack in Machine Learning
by: He, Xing, et al.
Published: (2024)
Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach
by: Zheng, Weidong, et al.
Published: (2026)
When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
by: Shen, Xinyue, et al.
Published: (2025)
Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution
by: Wu, Yixin, et al.
Published: (2024)
Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate
by: Qi, Senmao, et al.
Published: (2025)
Detecting Quishing Attacks with Machine Learning Techniques Through QR Code Analysis
by: Trad, Fouad, et al.
Published: (2025)
How Secure is Forgetting? Linking Machine Unlearning to Machine Learning Attacks
by: P., Muhammed Shafi K., et al.
Published: (2025)
A Comprehensive Review of Adversarial Attacks on Machine Learning
by: Ahmed, Syed Quiser, et al.
Published: (2024)
HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?
by: Jiang, Yukun, et al.
Published: (2026)