Staff View: Federated Sinkhorn :: Library Catalog

Bibliographic Details
Main Authors:	Kulcsar, Jeremy; Kungurtsev, Vyacheslav; Korpas, Georgios; Giaconi, Giulio; Shoosmith, William
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing; Machine Learning
Links:	https://arxiv.org/abs/2502.07021

_version_	1866915779375529984
author	Kulcsar, Jeremy; Kungurtsev, Vyacheslav; Korpas, Georgios; Giaconi, Giulio; Shoosmith, William
author_facet	Kulcsar, Jeremy; Kungurtsev, Vyacheslav; Korpas, Georgios; Giaconi, Giulio; Shoosmith, William
contents	We study distributed Sinkhorn iterations for entropy-regularized optimal transport when the Gibbs kernel operator is row-partitioned across c workers and cannot be centralized. We present Federated Sinkhorn, two exact synchronous protocols that exchange only scaling-vector slices: (i) an All-to-All scheme implemented by Allgather, and (ii) a Star (parameter-server) scheme implemented by client-to-server sends and server-to-client broadcasts. For both, we derive closed-form per-iteration compute, communication, and memory costs under an alpha-beta latency-bandwidth model, and show that the distributed iterates match centralized Sinkhorn under standard positivity assumptions. Multi-node CPU/GPU experiments validate the model and show that repeated global scaling exchange quickly becomes the dominant bottleneck as c increases. We also report an optional bounded-delay asynchronous schedule and an optional privacy measurement layer for communicated log-scalings.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_07021
institution	arXiv
publishDate	2025
record_format	arxiv
title	Federated Sinkhorn
topic	Distributed, Parallel, and Cluster Computing; Machine Learning
url	https://arxiv.org/abs/2502.07021
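
Illustrative note on the contents field: the abstract states that the All-to-All protocol exchanges only scaling-vector slices and reproduces the centralized Sinkhorn iterates. The sketch below is one possible reading of that claim, not the authors' implementation. It assumes each worker holds its row block of the Gibbs kernel K and of its transpose, and it emulates Allgather in a single process with np.concatenate. The function names, the partitioning via np.array_split, and the toy problem are all hypothetical.

```python
import numpy as np

def centralized_sinkhorn(K, a, b, iters):
    """Plain Sinkhorn scaling: u = a / (K v), v = b / (K^T u)."""
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u, v

def simulated_all_to_all_sinkhorn(K, a, b, iters, c):
    """Single-process simulation of c workers (assumed layout, see lead-in).

    Worker j owns a contiguous slice of u and v plus the matching row
    blocks of K and K^T. np.concatenate stands in for Allgather.
    """
    n, m = K.shape
    row_parts = np.array_split(np.arange(n), c)  # slices of u owned per worker
    col_parts = np.array_split(np.arange(m), c)  # slices of v owned per worker
    K_rows = [K[idx, :] for idx in row_parts]    # worker j's rows of K
    Kt_rows = [K.T[idx, :] for idx in col_parts] # worker j's rows of K^T
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(iters):
        # Each worker updates only its own slice of u from the full v.
        u_slices = [a[idx] / (K_rows[j] @ v) for j, idx in enumerate(row_parts)]
        u = np.concatenate(u_slices)             # emulated Allgather of u
        # Each worker updates only its own slice of v from the full u.
        v_slices = [b[idx] / (Kt_rows[j] @ u) for j, idx in enumerate(col_parts)]
        v = np.concatenate(v_slices)             # emulated Allgather of v
    return u, v

# Toy problem: squared Euclidean cost between random 2-D points.
rng = np.random.default_rng(0)
x = rng.random((6, 2))
y = rng.random((5, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-C / 0.5)                             # strictly positive Gibbs kernel
a = np.full(6, 1.0 / 6)
b = np.full(5, 1.0 / 5)

u_ref, v_ref = centralized_sinkhorn(K, a, b, iters=50)
u_sim, v_sim = simulated_all_to_all_sinkhorn(K, a, b, iters=50, c=3)
print(np.allclose(u_ref, u_sim), np.allclose(v_ref, v_sim))
```

In a real multi-node setting the two np.concatenate calls would be replaced by a collective such as MPI Allgather, which is where the per-iteration communication cost discussed in the abstract arises.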
