Library Catalog

Bibliographic Details
Main Authors:	Thakkar, Janvi; Zizzo, Giulio; Maffeis, Sergio
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Cryptography and Security
Online Access:	https://arxiv.org/abs/2312.14260

Similar Items

Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation
by: Thakkar, Janvi, et al.
Published: (2024)

HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment
by: Belkhiter, Yannis, et al.
Published: (2024)

Blue Teaming Function-Calling Agents
by: Dolcetti, Greta, et al.
Published: (2026)

Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models
by: Belkhiter, Yannis, et al.
Published: (2026)

Towards a Practical Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via Randomized Smoothing
by: Gibert, Daniel, et al.
Published: (2023)

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
by: Zizzo, Giulio, et al.
Published: (2025)

A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing
by: Gibert, Daniel, et al.
Published: (2024)

IDEA: Invariant Defense for Graph Adversarial Robustness
by: Tao, Shuchang, et al.
Published: (2023)

MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
by: Cornacchia, Giandomenico, et al.
Published: (2024)

Quantitative Resilience Modeling for Autonomous Cyber Defense
by: Cadet, Xavier, et al.
Published: (2025)

A No-Defense Defense Against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?
by: elShehaby, Mohamed, et al.
Published: (2026)

Adversarial Suffix Filtering: a Defense Pipeline for LLMs
by: Khachaturov, David, et al.
Published: (2025)

Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial Defense
by: Han, Qiao, et al.
Published: (2024)

Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost
by: Yeh, Cheng-Han, et al.
Published: (2024)

A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
by: Wu, Yihan, et al.
Published: (2023)

Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing
by: Xin, Yuan, et al.
Published: (2023)

Pruning Graphs by Adversarial Robustness Evaluation to Strengthen GNN Defenses
by: Wang, Yongyu
Published: (2025)

Position: Towards Resilience Against Adversarial Examples
by: Dai, Sihui, et al.
Published: (2024)

FedBAP: Backdoor Defense via Benign Adversarial Perturbation in Federated Learning
by: Yan, Xinhai, et al.
Published: (2025)

Assessing the Resilience of Automotive Intrusion Detection Systems to Adversarial Manipulation
by: Longari, Stefano, et al.
Published: (2025)

Class-feature Watermark: A Resilient Black-box Watermark Against Model Extraction Attacks
by: Xiao, Yaxin, et al.
Published: (2025)

Signal Watermark on Large Language Models
by: Xu, Zhenyu, et al.
Published: (2024)

Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence
by: Hong, Hanbin, et al.
Published: (2023)

One Stone, Two Birds: Enhancing Adversarial Defense Through the Lens of Distributional Discrepancy
by: Zhang, Jiacheng, et al.
Published: (2025)

Mitigating the Structural Bias in Graph Adversarial Defenses
by: Fang, Junyuan, et al.
Published: (2025)

Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models
by: Sandoval, Gustavo, et al.
Published: (2025)

Development of an Edge Resilient ML Ensemble to Tolerate ICS Adversarial Attacks
by: Yao, Likai, et al.
Published: (2024)

Continual Adversarial Defense
by: Wang, Qian, et al.
Published: (2023)

Ideal Attribution and Faithful Watermarks for Language Models
by: Song, Min Jae, et al.
Published: (2025)

Black-Box Detection of Language Model Watermarks
by: Gloaguen, Thibaud, et al.
Published: (2024)

Generalization Properties of Adversarial Training for $\ell_0$-Bounded Adversarial Attacks
by: Delgosha, Payam, et al.
Published: (2024)

On the Effectiveness of Adversarial Training on Malware Classifiers
by: Bostani, Hamid, et al.
Published: (2024)

Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates
by: Angioni, Daniele, et al.
Published: (2024)

Secure and Private Federated Learning: Achieving Adversarial Resilience through Robust Aggregation
by: Yang, Kun, et al.
Published: (2025)

A Defensive Framework Against Adversarial Attacks on Machine Learning-Based Network Intrusion Detection Systems
by: Tafreshian, Benyamin, et al.
Published: (2025)

UnMarker: A Universal Attack on Defensive Image Watermarking
by: Kassis, Andre, et al.
Published: (2024)

Breaking Distortion-free Watermarks in Large Language Models
by: Reynolds, Shayleen, et al.
Published: (2025)

Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
by: Zhang, Hanlin, et al.
Published: (2023)

Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions
by: Wu, Yihan, et al.
Published: (2024)

Mitigating Error Amplification in Fast Adversarial Training
by: Zhao, Mengnan, et al.
Published: (2026)