Saved in:
Bibliographic Details
Main Authors: Gao, Jiashi, Wang, Ziwei, Zhao, Xiangyu, Shi, Xinming, Yao, Xin, Wei, Xuetao
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2410.06509
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866908584535654400
author Gao, Jiashi
Wang, Ziwei
Zhao, Xiangyu
Shi, Xinming
Yao, Xin
Wei, Xuetao
author_facet Gao, Jiashi
Wang, Ziwei
Zhao, Xiangyu
Shi, Xinming
Yao, Xin
Wei, Xuetao
contents Federated learning (FL), integrating group fairness mechanisms, allows multiple clients to collaboratively train a global model that makes unbiased decisions for different populations grouped by sensitive attributes (e.g., gender and race). Due to its distributed nature, previous studies have demonstrated that FL systems are vulnerable to model poisoning attacks. However, these studies primarily focus on perturbing accuracy, leaving a critical question unexplored: Can an attacker bypass the group fairness mechanisms in FL and manipulate the global model to be biased? The motivations for such an attack vary; an attacker might seek higher accuracy, yet fairness considerations typically limit the accuracy of the global model or aim to cause ethical disruption. To address this question, we design a novel form of attack in FL, termed Profit-driven Fairness Attack (PFAttack), which aims not to degrade global model accuracy but to bypass fairness mechanisms. Our fundamental insight is that group fairness seeks to weaken the dependence of outputs on input attributes related to sensitive information. In the proposed PFAttack, an attacker can recover this dependence through local fine-tuning across various sensitive groups, thereby creating a biased yet accuracy-preserving malicious model and injecting it into FL through model replacement. Compared to attacks targeting accuracy, PFAttack is more stealthy. The malicious model in PFAttack exhibits subtle parameter variations relative to the original global model, making it robust against detection and filtering by Byzantine-resilient aggregations. Extensive experiments on benchmark datasets are conducted for four fair FL frameworks and three Byzantine-resilient aggregations against model poisoning, demonstrating the effectiveness and stealth of PFAttack in bypassing group fairness mechanisms in FL.
format Preprint
id arxiv_https___arxiv_org_abs_2410_06509
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle PFAttack: Stealthy Attack Bypassing Group Fairness in Federated Learning
Gao, Jiashi
Wang, Ziwei
Zhao, Xiangyu
Shi, Xinming
Yao, Xin
Wei, Xuetao
Machine Learning
Federated learning (FL), integrating group fairness mechanisms, allows multiple clients to collaboratively train a global model that makes unbiased decisions for different populations grouped by sensitive attributes (e.g., gender and race). Due to its distributed nature, previous studies have demonstrated that FL systems are vulnerable to model poisoning attacks. However, these studies primarily focus on perturbing accuracy, leaving a critical question unexplored: Can an attacker bypass the group fairness mechanisms in FL and manipulate the global model to be biased? The motivations for such an attack vary; an attacker might seek higher accuracy, yet fairness considerations typically limit the accuracy of the global model or aim to cause ethical disruption. To address this question, we design a novel form of attack in FL, termed Profit-driven Fairness Attack (PFAttack), which aims not to degrade global model accuracy but to bypass fairness mechanisms. Our fundamental insight is that group fairness seeks to weaken the dependence of outputs on input attributes related to sensitive information. In the proposed PFAttack, an attacker can recover this dependence through local fine-tuning across various sensitive groups, thereby creating a biased yet accuracy-preserving malicious model and injecting it into FL through model replacement. Compared to attacks targeting accuracy, PFAttack is more stealthy. The malicious model in PFAttack exhibits subtle parameter variations relative to the original global model, making it robust against detection and filtering by Byzantine-resilient aggregations. Extensive experiments on benchmark datasets are conducted for four fair FL frameworks and three Byzantine-resilient aggregations against model poisoning, demonstrating the effectiveness and stealth of PFAttack in bypassing group fairness mechanisms in FL.
title PFAttack: Stealthy Attack Bypassing Group Fairness in Federated Learning
topic Machine Learning
url https://arxiv.org/abs/2410.06509