
Bibliographic Details
Main authors:	Tadi, Ali Abbasi; Alhadidi, Dima; Rueda, Luis
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Cryptography and Security
Online access:	https://arxiv.org/abs/2501.11706

_version_	1866912196845371392
author	Tadi, Ali Abbasi; Alhadidi, Dima; Rueda, Luis
author_facet	Tadi, Ali Abbasi; Alhadidi, Dima; Rueda, Luis
contents	Transformers, a cornerstone of deep-learning architectures for sequential data, have achieved state-of-the-art results in tasks like Natural Language Processing (NLP). Models such as BERT and GPT-3 exemplify their success and have driven the rise of large language models (LLMs). However, a critical challenge persists: safeguarding the privacy of data used in LLM training. Privacy-preserving techniques like Federated Learning (FL) offer potential solutions, but practical limitations hinder their effectiveness for Transformer training. Two primary issues are (I) the risk of sensitive information leakage due to aggregation methods like FedAvg or FedSGD, and (II) the high communication overhead caused by the large size of Transformer models. This paper introduces a novel FL method that reduces communication overhead while maintaining competitive utility. Our approach avoids sharing full model weights by simulating a global model locally. We apply k-means clustering to each Transformer layer, compute centroids locally, and transmit only these centroids to the server instead of full weights or gradients. To enhance security, we leverage Intel SGX for secure transmission of centroids. Evaluated on a translation task, our method achieves utility comparable to state-of-the-art baselines while significantly reducing communication costs. This provides a more efficient and privacy-preserving FL solution for Transformer models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_11706
institution	arXiv
publishDate	2025
record_format	arxiv
title	Trustformer: A Trusted Federated Transformer
topic	Machine Learning; Cryptography and Security
url	https://arxiv.org/abs/2501.11706
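
The abstract in the contents field says each client applies k-means clustering to every Transformer layer and transmits only the resulting centroids instead of full weights or gradients. The following is a minimal sketch of that client-side step, not the authors' implementation. It assumes that each layer is flattened to a list of scalar weights and clustered in one dimension, and that the initial centroids are evenly spaced order statistics; the abstract states neither. The names kmeans_1d, client_update, and count_floats are hypothetical, and the Intel SGX transport is not modelled.

```python
from typing import Dict, List


def kmeans_1d(values: List[float], k: int, iters: int = 20) -> List[float]:
    """Plain Lloyd's k-means on scalars; returns k centroids in ascending order."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    # Assumption: deterministic start from evenly spaced order statistics.
    if k == 1:
        centroids = [ordered[len(ordered) // 2]]
    else:
        centroids = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: put each weight in the bucket of its nearest centroid.
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            buckets[nearest].append(v)
        # Update step: move each centroid to the mean of its bucket
        # (an empty bucket keeps its previous centroid).
        updated = [
            sum(b) / len(b) if b else centroids[j] for j, b in enumerate(buckets)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def client_update(layers: Dict[str, List[float]], k: int) -> Dict[str, List[float]]:
    """Cluster every layer locally and keep only its k centroids for upload."""
    return {name: kmeans_1d(weights, k) for name, weights in layers.items()}


def count_floats(tensors: Dict[str, List[float]]) -> int:
    """Number of scalar values that would have to be sent to the server."""
    return sum(len(values) for values in tensors.values())


# Toy "model" with two layers of six weights each (illustrative values only).
layers = {
    "attention": [0.9, 1.1, 1.0, -1.0, -0.9, -1.1],
    "feed_forward": [0.0, 0.2, 0.1, 5.0, 5.2, 5.1],
}
payload = client_update(layers, k=2)
print("floats in full weights:", count_floats(layers))
print("floats in centroid payload:", count_floats(payload))
```

In this sketch the upload size depends only on k and the number of layers, not on the number of weights per layer, which is the source of the communication saving the abstract describes.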

Similar Items