MARC21 :: Library Catalog

Bibliographic Details
Main Authors:	Ghorbel, Mahmoud; Bouzidi, Halima; Bilasco, Ioan Marius; Alouani, Ihsen
Format:	Preprint
Published:	2024
Subjects:	Cryptography and Security; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2406.01708

_version_	1866908315776188416
author	Ghorbel, Mahmoud; Bouzidi, Halima; Bilasco, Ioan Marius; Alouani, Ihsen
author_facet	Ghorbel, Mahmoud; Bouzidi, Halima; Bilasco, Ioan Marius; Alouani, Ihsen
contents	Model hijacking can cause significant accountability and security risks, since the owner of a hijacked model can be framed for having their model offer illegal or unethical services. Prior work treats model hijacking as a training-time attack, in which the adversary requires full access to the training of the ML model. In this paper, we consider a stronger threat model for an inference-time hijacking attack, where the adversary has no access to the training phase of the victim model. Our intuition is that ML models, which are typically over-parameterized, might have the capacity to (unintentionally) learn more than the task they are trained for. We propose SnatchML, a new training-free model hijacking attack that leverages the extra capacity learned by the victim model to infer different tasks, which can be semantically related or unrelated to the original one. Our results on models deployed on AWS SageMaker show that SnatchML can deliver high accuracy on hijacking tasks. Interestingly, while all previous approaches are limited by the number of classes in the benign task, SnatchML can hijack models for tasks that contain more classes than the original. We explore different methods to mitigate this risk. We propose meta-unlearning, which is designed to help the model unlearn a potentially malicious task while training for the original task. We also provide insights on over-parameterization as a possible inherent factor that facilitates model hijacking and, accordingly, propose a compression-based countermeasure to counteract this attack. We believe this work offers a previously overlooked perspective on model hijacking attacks, presenting a stronger threat model and higher applicability in real-world contexts.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_01708
institution	arXiv
publishDate	2024
record_format	arxiv
title	SnatchML: Hijacking ML models without Training Access
topic	Cryptography and Security; Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2406.01708
