| Main Authors: | Thakkar, Janvi; Zizzo, Giulio; Maffeis, Sergio |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2312.14260 |
Similar Items
Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation
by: Thakkar, Janvi, et al.
Published: (2024)
HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment
by: Belkhiter, Yannis, et al.
Published: (2024)
Blue Teaming Function-Calling Agents
by: Dolcetti, Greta, et al.
Published: (2026)
Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models
by: Belkhiter, Yannis, et al.
Published: (2026)
Towards a Practical Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via Randomized Smoothing
by: Gibert, Daniel, et al.
Published: (2023)
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
by: Zizzo, Giulio, et al.
Published: (2025)
A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing
by: Gibert, Daniel, et al.
Published: (2024)
IDEA: Invariant Defense for Graph Adversarial Robustness
by: Tao, Shuchang, et al.
Published: (2023)
MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
by: Cornacchia, Giandomenico, et al.
Published: (2024)
Quantitative Resilience Modeling for Autonomous Cyber Defense
by: Cadet, Xavier, et al.
Published: (2025)
A No-Defense Defense Against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?
by: elShehaby, Mohamed, et al.
Published: (2026)
Adversarial Suffix Filtering: a Defense Pipeline for LLMs
by: Khachaturov, David, et al.
Published: (2025)
Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial Defense
by: Han, Qiao, et al.
Published: (2024)
Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost
by: Yeh, Cheng-Han, et al.
Published: (2024)
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
by: Wu, Yihan, et al.
Published: (2023)
Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing
by: Xin, Yuan, et al.
Published: (2023)
Pruning Graphs by Adversarial Robustness Evaluation to Strengthen GNN Defenses
by: Wang, Yongyu
Published: (2025)
Position: Towards Resilience Against Adversarial Examples
by: Dai, Sihui, et al.
Published: (2024)
FedBAP: Backdoor Defense via Benign Adversarial Perturbation in Federated Learning
by: Yan, Xinhai, et al.
Published: (2025)
Assessing the Resilience of Automotive Intrusion Detection Systems to Adversarial Manipulation
by: Longari, Stefano, et al.
Published: (2025)
Class-feature Watermark: A Resilient Black-box Watermark Against Model Extraction Attacks
by: Xiao, Yaxin, et al.
Published: (2025)
Signal Watermark on Large Language Models
by: Xu, Zhenyu, et al.
Published: (2024)
Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence
by: Hong, Hanbin, et al.
Published: (2023)
One Stone, Two Birds: Enhancing Adversarial Defense Through the Lens of Distributional Discrepancy
by: Zhang, Jiacheng, et al.
Published: (2025)
Mitigating the Structural Bias in Graph Adversarial Defenses
by: Fang, Junyuan, et al.
Published: (2025)
Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models
by: Sandoval, Gustavo, et al.
Published: (2025)
Development of an Edge Resilient ML Ensemble to Tolerate ICS Adversarial Attacks
by: Yao, Likai, et al.
Published: (2024)
Continual Adversarial Defense
by: Wang, Qian, et al.
Published: (2023)
Ideal Attribution and Faithful Watermarks for Language Models
by: Song, Min Jae, et al.
Published: (2025)
Black-Box Detection of Language Model Watermarks
by: Gloaguen, Thibaud, et al.
Published: (2024)
Generalization Properties of Adversarial Training for $\ell_0$-Bounded Adversarial Attacks
by: Delgosha, Payam, et al.
Published: (2024)
On the Effectiveness of Adversarial Training on Malware Classifiers
by: Bostani, Hamid, et al.
Published: (2024)
Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates
by: Angioni, Daniele, et al.
Published: (2024)
Secure and Private Federated Learning: Achieving Adversarial Resilience through Robust Aggregation
by: Yang, Kun, et al.
Published: (2025)
A Defensive Framework Against Adversarial Attacks on Machine Learning-Based Network Intrusion Detection Systems
by: Tafreshian, Benyamin, et al.
Published: (2025)
UnMarker: A Universal Attack on Defensive Image Watermarking
by: Kassis, Andre, et al.
Published: (2024)
Breaking Distortion-free Watermarks in Large Language Models
by: Reynolds, Shayleen, et al.
Published: (2025)
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
by: Zhang, Hanlin, et al.
Published: (2023)
Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions
by: Wu, Yihan, et al.
Published: (2024)
Mitigating Error Amplification in Fast Adversarial Training
by: Zhao, Mengnan, et al.
Published: (2026)