| Main Authors: | Jang, Won-Jun; Park, Hyeon-Seo; Lee, Si-Hyeon |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2502.06349 |
| _version_ | 1866916743511801856 |
|---|---|
| author | Jang, Won-Jun; Park, Hyeon-Seo; Lee, Si-Hyeon |
| author_facet | Jang, Won-Jun; Park, Hyeon-Seo; Lee, Si-Hyeon |
| contents | Federated ensemble distillation addresses client heterogeneity by generating pseudo-labels for an unlabeled server dataset based on client predictions and training the server model using the pseudo-labeled dataset. The unlabeled server dataset can either be pre-existing or generated through a data-free approach. The effectiveness of this approach critically depends on the method of assigning weights to client predictions when creating pseudo-labels, especially in highly heterogeneous settings. Inspired by theoretical results from GANs, we propose a provably near-optimal weighting method that leverages client discriminators trained with a server-distributed generator and local datasets. Our experiments on various image classification tasks demonstrate that the proposed method significantly outperforms baselines. Furthermore, we show that the additional communication cost, client-side privacy leakage, and client-side computational overhead introduced by our method are negligible, both in scenarios with and without a pre-existing server dataset. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2502_06349 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2502.06349 |
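
The abstract in the `contents` field describes forming pseudo-labels as a weighted combination of client predictions, with weights derived from client discriminators. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each client's weight on a sample is proportional to the odds `d / (1 - d)` of its discriminator output `d`, a choice motivated by the standard GAN result that an optimal discriminator's odds equal the ratio of the data density to the generator density. The record itself does not state the weighting formula, and the function names `odds` and `weighted_pseudo_label` are hypothetical.

```python
def odds(d, eps=1e-6):
    """Map a discriminator output d in (0, 1) to the odds d / (1 - d).

    Assumption: for an optimal GAN discriminator, this ratio estimates
    p_client(x) / p_generator(x). Clamping avoids division by zero.
    """
    d = min(max(d, eps), 1.0 - eps)
    return d / (1.0 - d)


def weighted_pseudo_label(client_probs, disc_scores):
    """Combine per-client class probabilities for ONE server sample.

    client_probs: list of per-client probability vectors (same length each).
    disc_scores:  list of per-client discriminator outputs for the sample.
    Returns a soft pseudo-label: the weighted average of client predictions,
    with weights proportional to the discriminator odds (an assumption).
    """
    raw = [odds(d) for d in disc_scores]
    total = sum(raw)
    weights = [r / total for r in raw]
    num_classes = len(client_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, client_probs))
        for c in range(num_classes)
    ]


# Hypothetical example: client 0 judges the sample as close to its own
# local data (score 0.9), client 1 does not (score 0.1).
label = weighted_pseudo_label([[0.9, 0.1], [0.2, 0.8]], [0.9, 0.1])
```

In this sketch the pseudo-label leans toward the client whose discriminator rates the sample as most similar to its local data, and equal discriminator scores reduce it to plain uniform averaging of client predictions.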