Staff View: Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks :: Library Catalog

Bibliographic Details
Main Authors:	Lafargue, Valentin; Monteiro, Adriana Laurindo; Claeys, Emmanuelle; Risser, Laurent; Loubes, Jean-Michel
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Optimization and Control; Applications
Online Access:	https://arxiv.org/abs/2507.20708

_version_	1866917325464141824
author	Lafargue, Valentin; Monteiro, Adriana Laurindo; Claeys, Emmanuelle; Risser, Laurent; Loubes, Jean-Michel
author_facet	Lafargue, Valentin; Monteiro, Adriana Laurindo; Claeys, Emmanuelle; Risser, Laurent; Loubes, Jean-Michel
contents	The rapid deployment of AI systems in high-stakes domains, including those classified as high-risk under the EU AI Act (Regulation (EU) 2024/1689), has intensified the need for reliable compliance auditing. For binary classifiers, regulatory risk assessment often relies on global fairness metrics such as the Disparate Impact ratio, widely used to evaluate potential discrimination. In typical auditing settings, the auditee provides a subset of its dataset to an auditor, while a supervisory authority may verify whether this subset is representative of the full underlying distribution. In this work, we investigate to what extent a malicious auditee can construct a fairness-compliant yet representative-looking sample from a non-compliant original distribution, thereby creating an illusion of fairness. We formalize this problem as a constrained distributional projection task and introduce mathematically grounded manipulation strategies based on entropic and optimal transport projections. These constructions characterize the minimal distributional shift required to satisfy fairness constraints. To counter such attacks, we formalize representativeness through distributional-distance-based statistical tests and systematically evaluate their ability to detect manipulated samples. Our analysis highlights the conditions under which fairness manipulation can remain statistically undetected and provides practical guidelines for strengthening supervisory verification. We validate our theoretical findings through experiments on standard tabular datasets for bias detection. Code is publicly available at https://github.com/ValentinLafargue/Inspection.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_20708
institution	arXiv
publishDate	2025
record_format	arxiv
title	Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks
topic	Machine Learning; Optimization and Control; Applications
url	https://arxiv.org/abs/2507.20708

Similar Items