:: Library Catalog

Bibliographic Details
Main authors:	Hui, Bo; Yuan, Haolin; Gong, Neil; Burlina, Philippe; Cao, Yinzhi
Format:	Preprint
Published:	2024
Subjects:	Cryptography and Security; Artificial Intelligence; Machine Learning
Online access:	https://arxiv.org/abs/2405.06823

Similar items

A Survey on Model Extraction Attacks and Defenses for Large Language Models
by: Zhao, Kaixiang, et al.
Published: (2025)

Mirror Mirror on the Wall, Have I Forgotten it All? A New Framework for Evaluating Machine Unlearning
by: Brimhall, Brennon, et al.
Published: (2025)

LeakSealer: A Semisupervised Defense for LLMs Against Prompt Injection and Leakage Attacks
by: Panebianco, Francesco, et al.
Published: (2025)

Prompt Injection Attacks on Large Language Models in Oncology
by: Clusmann, Jan, et al.
Published: (2024)

CHAI: Command Hijacking against embodied AI
by: Burbano, Luis, et al.
Published: (2025)

Formalizing and Benchmarking Prompt Injection Attacks and Defenses
by: Liu, Yupei, et al.
Published: (2023)

Enhancing Prompt Injection Attacks to LLMs via Poisoning Alignment
by: Shao, Zedian, et al.
Published: (2024)

Refusing Safe Prompts for Multi-modal Large Language Models
by: Shao, Zedian, et al.
Published: (2024)

Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Unlearning Completeness
by: Wang, Cheng-Long, et al.
Published: (2025)

A Survey of Model Extraction Attacks and Defenses in Distributed Computing Environments
by: Zhao, Kaixiang, et al.
Published: (2025)

A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives
by: Zhao, Kaixiang, et al.
Published: (2025)

Recalling The Forgotten Class Memberships: Unlearned Models Can Be Noisy Labelers to Leak Privacy
by: Sui, Zhihao, et al.
Published: (2025)

TrojFM: Resource-efficient Backdoor Attacks against Very Large Foundation Models
by: Nie, Yuzhou, et al.
Published: (2024)

Your Agent Can Defend Itself against Backdoor Attacks
by: Changjiang, Li, et al.
Published: (2025)

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
by: Yuan, Zhuowen, et al.
Published: (2024)

Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening
by: Zhang, Mohan, et al.
Published: (2026)

Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
by: Chang, Hongyan, et al.
Published: (2024)

Attacking LLMs and AI Agents: Advertisement Embedding Attacks Against Large Language Models
by: Guo, Qiming, et al.
Published: (2025)

Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models
by: Li, Xiao, et al.
Published: (2024)

Backdoor Attack against One-Class Sequential Anomaly Detection Models
by: Cheng, He, et al.
Published: (2024)

AudioJailbreak: Jailbreak Attacks against End-to-End Large Audio-Language Models
by: Chen, Guangke, et al.
Published: (2025)

Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection
by: Shao, Zedian, et al.
Published: (2026)

A Critical Evaluation of Defenses against Prompt Injection Attacks
by: Jia, Yuqi, et al.
Published: (2025)

Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing
by: Wahréus, Johan, et al.
Published: (2025)

Model Inversion Attacks on Llama 3: Extracting PII from Large Language Models
by: Sivashanmugam, Sathesh P.
Published: (2025)

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
by: Huang, Tiansheng, et al.
Published: (2024)

Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs
by: Yuan, Leitao, et al.
Published: (2026)

Unlocking Memorization in Large Language Models with Dynamic Soft Prompting
by: Wang, Zhepeng, et al.
Published: (2024)

Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning
by: Zhang, Yujie, et al.
Published: (2024)

From Data Leak to Secret Misses: The Impact of Data Leakage on Secret Detection Models
by: Soltaniani, Farnaz, et al.
Published: (2026)

Arondight: Red Teaming Large Vision Language Models with Auto-generated Multi-modal Jailbreak Prompts
by: Liu, Yi, et al.
Published: (2024)

Jailbreaking Safeguarded Text-to-Image Models via Large Language Models
by: Jiang, Zhengyuan, et al.
Published: (2025)

Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning
by: Liu, Guozhi, et al.
Published: (2025)

Turning Black Box into White Box: Dataset Distillation Leaks
by: Chen, Huajie, et al.
Published: (2026)

Large Language Models in Cybersecurity: Applications, Vulnerabilities, and Defense Techniques
by: Jaffal, Niveen O., et al.
Published: (2025)

Attention Tracker: Detecting Prompt Injection Attacks in LLMs
by: Hung, Kuo-Han, et al.
Published: (2024)

Verification of Bit-Flip Attacks against Quantized Neural Networks
by: Zhang, Yedi, et al.
Published: (2025)

Client-Side Patching against Backdoor Attacks in Federated Learning
by: Molina-Coronado, Borja
Published: (2024)

The Application of Transformer-Based Models for Predicting Consequences of Cyber Attacks
by: Chhetri, Bipin, et al.
Published: (2025)

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack
by: Yue, Murong, et al.
Published: (2025)