Bibliographic Details
Main Authors: Vologdin, Mariia; Tao, Yuchao; Gilad, Amir
Format: Preprint
Published: 2026
Subjects: Databases
Online Access: https://arxiv.org/abs/2605.22952
author Vologdin, Mariia
Tao, Yuchao
Gilad, Amir
contents Differential privacy (DP) has become the de facto standard for protecting sensitive data, providing strong guarantees that published statistics or models reveal limited information about any individual. However, privacy noise and restricted data access make it increasingly difficult to assess the fairness and reliability of private datasets. In this paper, we propose a formal framework for quantifying data unfairness under DP. We identify three core desiderata for unfairness measures based on previous work: positivity, monotonicity, and DP computability. We further instantiate them through three complementary measures: (1) a mutual information-based measure with a total variation distance proxy suitable for DP, (2) a data repair-based measure approximated via a reduction to weighted MaxSAT, and (3) a top-$k$ tuple contribution measure that isolates the most influential records in fairness violations. We design privacy-preserving algorithms and analyze their sensitivity, accuracy, and efficiency. Extensive experiments on multiple real-world datasets demonstrate that our proposed measures faithfully approximate their non-private counterparts, effectively quantify bias under privacy constraints, and provide insights for data management.
format Preprint
id arxiv_https___arxiv_org_abs_2605_22952
institution arXiv
publishDate 2026
record_format arxiv
title Measuring Database Unfairness via Dependency Quantification Under Differential Privacy
topic Databases
url https://arxiv.org/abs/2605.22952
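
The abstract mentions a total variation distance proxy for a mutual information-based measure under DP, but this record does not describe the construction. The following is only a minimal illustrative sketch, not the authors' algorithm. It assumes the dependence between a sensitive attribute S and an outcome Y is scored as the total variation distance between the empirical joint distribution and the product of its marginals, and that privacy comes from adding Laplace noise to contingency-table counts (L1 sensitivity 1 under add/remove-one-record neighboring). All function names, parameters, and the fixed attribute domains are hypothetical.

```python
import random
from collections import Counter


def tv_dependence(counts):
    """TV distance between the joint distribution and the product of its marginals.

    `counts` maps (s, y) pairs to non-negative (possibly noisy) cell counts.
    Returns 0.0 when S and Y are empirically independent.
    """
    total = sum(counts.values())
    if total <= 0:
        return 0.0
    row, col = Counter(), Counter()
    for (s, y), c in counts.items():
        row[s] += c
        col[y] += c
    return 0.5 * sum(
        abs(counts.get((s, y), 0) / total - (row[s] / total) * (col[y] / total))
        for s in row
        for y in col
    )


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_tv_dependence(counts, domain_s, domain_y, epsilon, rng=None):
    """Assumed mechanism: noise every cell of the fixed-domain table, then post-process.

    The domains must be fixed in advance and independent of the data. Clamping to
    zero and computing the score are post-processing of the noisy histogram.
    Floating-point Laplace sampling is not hardened; this is a sketch only.
    """
    rng = rng or random.Random()
    noisy = {
        (s, y): max(0.0, counts.get((s, y), 0) + laplace(1.0 / epsilon, rng))
        for s in domain_s
        for y in domain_y
    }
    return tv_dependence(noisy)


if __name__ == "__main__":
    table = {("a", 0): 50, ("b", 1): 50}  # toy data: Y fully determined by S
    print(tv_dependence(table))
    print(dp_tv_dependence(table, ["a", "b"], [0, 1], epsilon=1.0))
```

In this sketch a smaller `epsilon` means more noise per cell, so the private score drifts further from the non-private one, especially on small tables.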