Bibliographic Details
Main Authors: Laouir, Ala Eddine, Imine, Abdessamad
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2406.11421
_version_ 1866916289780383744
author Laouir, Ala Eddine
Imine, Abdessamad
author_facet Laouir, Ala Eddine
Imine, Abdessamad
contents In many real-world scenarios, multiple data providers need to collaboratively analyze their private data. The challenges of these applications, especially at big data scale, are time and resource efficiency as well as end-to-end privacy with minimal loss of accuracy. Existing approaches rely primarily on cryptography, which improves privacy but at the expense of query response time. However, current big data analytics frameworks require fast and accurate responses to large-scale queries, making cryptography-based solutions less suitable. In this work, we address the problem of combining Approximate Query Processing (AQP) and Differential Privacy (DP) in a private federated environment answering range queries on horizontally partitioned multidimensional data. We propose a new approach that uses a data distribution-aware online sampling technique to accelerate the execution of range queries and ensure end-to-end data privacy during and after analysis with minimal loss in accuracy. Through empirical evaluation, we show that our solution can provide up to 8 times faster processing than the basic non-secure solution while maintaining accuracy, formal privacy guarantees, and resilience to learning-based attacks.
format Preprint
id arxiv_https___arxiv_org_abs_2406_11421
institution arXiv
publishDate 2024
record_format arxiv
title Private Approximate Query over Horizontal Data Federation
topic Databases
Cryptography and Security
H.2.8
url https://arxiv.org/abs/2406.11421
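note The abstract combines sampling-based approximate answers with differential privacy for range queries over horizontally partitioned data. The sketch below is only an illustration of that general idea and is not the authors' algorithm: it assumes plain uniform Bernoulli sampling (the paper describes a distribution-aware technique instead), a simple count query, and Laplace noise added locally by each provider. All names (`in_range`, `laplace`, `local_estimate`, `federated_count`) and parameters (`rate`, `epsilon`) are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]
Box = Sequence[Tuple[float, float]]  # one (low, high) bound per dimension


def in_range(row: Row, box: Box) -> bool:
    """True if every coordinate of the row lies inside its (low, high) bound."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(row, box))


def laplace(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) noise, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_estimate(rows: List[Row], box: Box, rate: float,
                   epsilon: float, rng: random.Random) -> float:
    """One provider's noisy, scaled-up estimate of its range count.

    Assumption: uniform Bernoulli sampling with probability `rate`.
    The count over the sample changes by at most 1 when one row is
    added or removed, so Laplace noise of scale 1/epsilon is used;
    dividing by `rate` afterwards is post-processing.
    """
    sample_hits = sum(
        1 for r in rows if rng.random() < rate and in_range(r, box)
    )
    noisy_hits = sample_hits + laplace(1.0 / epsilon, rng)
    return noisy_hits / rate


def federated_count(providers: List[List[Row]], box: Box, rate: float,
                    epsilon: float, rng: random.Random) -> float:
    """Sum of the providers' local estimates; only noisy values are shared."""
    return sum(local_estimate(p, box, rate, epsilon, rng) for p in providers)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical providers holding disjoint rows of the same 2-D schema.
    providers = [
        [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(5000)]
        for _ in range(2)
    ]
    box = [(20.0, 60.0), (10.0, 90.0)]
    exact = sum(in_range(r, box) for p in providers for r in p)
    approx = federated_count(providers, box, rate=0.2, epsilon=1.0, rng=rng)
    print("exact:", exact, "approximate private estimate:", round(approx, 1))
```

In this sketch a lower `rate` means fewer rows are tested against the range but the estimate has higher variance, and a smaller `epsilon` means more noise; how the paper actually balances these is not stated in this record.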