Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Ezzeddine, Fatima, Zammar, Osama, Giordano, Silvia, Ayoub, Omran
Formato:	Preprint
Publicado:	2026
Materias:	Machine Learning
Acceso en línea:	https://arxiv.org/abs/2602.03611
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866915771474509824
author	Ezzeddine, Fatima Zammar, Osama Giordano, Silvia Ayoub, Omran
author_facet	Ezzeddine, Fatima Zammar, Osama Giordano, Silvia Ayoub, Omran
contents	Counterfactual explanations (CFs) are increasingly integrated into Machine Learning as a Service (MLaaS) systems to improve transparency; however, ML models deployed via APIs are already vulnerable to privacy attacks such as membership inference and model extraction, and the impact of explanations on this threat landscape remains insufficiently understood. In this work, we focus on the problem of how CFs expand the attack surface of MLaaS by strengthening membership inference attacks (MIAs), and on the need to design defense mechanisms that mitigate this emerging risk without undermining utility and explainability. First, we systematically analyze how exposing CFs through query-based APIs enables more effective shadow-based MIAs. Second, we propose a defense framework that integrates Differential Privacy (DP) with Active Learning (AL) to jointly reduce memorization and limit effective training data exposure. Finally, we conduct an extensive empirical evaluation to characterize the three-way trade-off between privacy leakage, predictive performance, and explanation quality. Our findings highlight the need to carefully balance transparency, utility, and privacy in the responsible deployment of explainable MLaaS systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_03611
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Explanations Leak: Membership Inference with Differential Privacy and Active Learning Defense Ezzeddine, Fatima Zammar, Osama Giordano, Silvia Ayoub, Omran Machine Learning Counterfactual explanations (CFs) are increasingly integrated into Machine Learning as a Service (MLaaS) systems to improve transparency; however, ML models deployed via APIs are already vulnerable to privacy attacks such as membership inference and model extraction, and the impact of explanations on this threat landscape remains insufficiently understood. In this work, we focus on the problem of how CFs expand the attack surface of MLaaS by strengthening membership inference attacks (MIAs), and on the need to design defense mechanisms that mitigate this emerging risk without undermining utility and explainability. First, we systematically analyze how exposing CFs through query-based APIs enables more effective shadow-based MIAs. Second, we propose a defense framework that integrates Differential Privacy (DP) with Active Learning (AL) to jointly reduce memorization and limit effective training data exposure. Finally, we conduct an extensive empirical evaluation to characterize the three-way trade-off between privacy leakage, predictive performance, and explanation quality. Our findings highlight the need to carefully balance transparency, utility, and privacy in the responsible deployment of explainable MLaaS systems.
title	Explanations Leak: Membership Inference with Differential Privacy and Active Learning Defense
topic	Machine Learning
url	https://arxiv.org/abs/2602.03611

Ejemplares similares