| Main Authors: | Pathade, Chetan, Patil, Shubham |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.07188 |
Similar Items
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
by: Pathade, Chetan
Published: (2025)
Invisible Injections: Exploiting Vision-Language Models Through Steganographic Prompt Embedding
by: Pathade, Chetan
Published: (2025)
Exposing Hidden Backdoors in NFT Smart Contracts: A Static Security Analysis of Rug Pull Patterns
by: Pathade, Chetan, et al.
Published: (2025)
Serverless AI Security: Attack Surface Analysis and Runtime Protection Mechanisms for FaaS-Based Machine Learning
by: Pathade, Chetan, et al.
Published: (2026)
Membership Inference Attacks Against In-Context Learning
by: Wen, Rui, et al.
Published: (2024)
Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning
by: An, Li, et al.
Published: (2025)
SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models
by: Afane, Mohamed, et al.
Published: (2025)
SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)
by: Meeus, Matthieu, et al.
Published: (2024)
Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs
by: Yan, Dong, et al.
Published: (2026)
Prompt Stealing Attacks Against Large Language Models
by: Sha, Zeyang, et al.
Published: (2024)
Window-based Membership Inference Attacks Against Fine-tuned Large Language Models
by: Chen, Yuetian, et al.
Published: (2026)
Tab-MIA: A Benchmark Dataset for Membership Inference Attacks on Tabular Data in LLMs
by: German, Eyal, et al.
Published: (2025)
MPMA: Preference Manipulation Attack Against Model Context Protocol
by: Wang, Zihan, et al.
Published: (2025)
Security Attacks on LLM-based Code Completion Tools
by: Cheng, Wen, et al.
Published: (2024)
SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression
by: Li, Yucheng, et al.
Published: (2025)
Hidden Data Privacy Breaches in Federated Learning
by: Gong, Xueluan, et al.
Published: (2024)
LoRA-Leak: Membership Inference Attacks Against LoRA Fine-tuned Language Models
by: Ran, Delong, et al.
Published: (2025)
Enhance Robustness of Language Models Against Variation Attack through Graph Integration
by: Xiong, Zi, et al.
Published: (2024)
Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
by: Antebi, Sagiv, et al.
Published: (2025)
Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation
by: Guo, Wenkai, et al.
Published: (2025)
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs
by: Wang, Jiawen, et al.
Published: (2025)
Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training
by: Li, Yuanfan, et al.
Published: (2025)
Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
by: Chen, Yiyi, et al.
Published: (2024)
SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
by: Shaaban, Mohamed, et al.
Published: (2026)
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
by: Zhou, Tong, et al.
Published: (2024)
DINA: A Dual Defense Framework Against Internal Noise and External Attacks in Natural Language Processing
by: Chuang, Ko-Wei, et al.
Published: (2025)
Safely Learning with Private Data: A Federated Learning Framework for Large Language Model
by: Zheng, JiaYing, et al.
Published: (2024)
FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks
by: Chen, Bocheng, et al.
Published: (2024)
Defending Against Indirect Prompt Injection Attacks With Spotlighting
by: Hines, Keegan, et al.
Published: (2024)
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
by: Liu, Fan, et al.
Published: (2024)
Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
by: He, Yu, et al.
Published: (2025)
LLMGuard: Guarding Against Unsafe LLM Behavior
by: Goyal, Shubh, et al.
Published: (2024)
Enabling Efficient Attack Investigation via Human-in-the-Loop Security Analysis
by: Tsegai, Saimon Amanuel, et al.
Published: (2022)
Adversarial Attacks Against Automated Fact-Checking: A Survey
by: Liu, Fanzhen, et al.
Published: (2025)
Self-Evaluation as a Defense Against Adversarial Attacks on LLMs
by: Brown, Hannah, et al.
Published: (2024)
SecureLLM: Using Compositionality to Build Provably Secure Language Models for Private, Sensitive, and Secret Data
by: Alabdulkareem, Abdulrahman, et al.
Published: (2024)
User Inference Attacks on Large Language Models
by: Kandpal, Nikhil, et al.
Published: (2023)
Membership Inference Attacks and Privacy in Topic Modeling
by: Manzonelli, Nico, et al.
Published: (2024)
$PD^3F$: A Pluggable and Dynamic DoS-Defense Framework Against Resource Consumption Attacks Targeting Large Language Models
by: Zhang, Yuanhe, et al.
Published: (2025)