Bibliographic Details
Main Authors: Wang, Peiran, Wang, Haohan
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2410.08665
_version_ 1866929537916338176
author Wang, Peiran
Wang, Haohan
author_facet Wang, Peiran
Wang, Haohan
contents In this paper, we introduce DistDD, a novel approach within the federated learning (FL) framework that reduces the need for repetitive communication by distilling data directly on clients' devices. Unlike traditional federated learning, which requires iterative model updates across nodes, DistDD performs a one-time distillation process that extracts a global distilled dataset, maintaining the privacy standards of federated learning while significantly cutting communication costs. By leveraging DistDD's distilled dataset, FL developers can perform just-in-time parameter tuning and neural architecture search (NAS) without repeating the whole FL process multiple times. We provide a detailed convergence proof of the DistDD algorithm, supporting its mathematical stability and reliability for practical applications. Our experiments demonstrate the effectiveness and robustness of DistDD, particularly in non-i.i.d. and mislabeled data scenarios, showcasing its potential to handle complex real-world data challenges in a way that differs from conventional federated learning methods. We also evaluate DistDD in a NAS use case and show its effectiveness and communication savings there.
format Preprint
id arxiv_https___arxiv_org_abs_2410_08665
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle DistDD: Distributed Data Distillation Aggregation through Gradient Matching
Wang, Peiran
Wang, Haohan
Machine Learning
Artificial Intelligence
title DistDD: Distributed Data Distillation Aggregation through Gradient Matching
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2410.08665