Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Cotorobai, Alexandre, Silva, Jorge Miguel, Oliveira, Jose Luis
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.08085
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866918100719370240
author	Cotorobai, Alexandre Silva, Jorge Miguel Oliveira, Jose Luis
author_facet	Cotorobai, Alexandre Silva, Jorge Miguel Oliveira, Jose Luis
contents	Privacy and regulatory barriers often hinder centralized machine learning solutions, particularly in sectors like healthcare where data cannot be freely shared. Federated learning has emerged as a powerful paradigm to address these concerns; however, existing frameworks primarily support gradient-based models, leaving a gap for more interpretable, tree-based approaches. This paper introduces a federated learning framework for Random Forest classifiers that preserves data privacy and provides robust performance in distributed settings. By leveraging PySyft for secure, privacy-aware computation, our method enables multiple institutions to collaboratively train Random Forest models on locally stored data without exposing sensitive information. The framework supports weighted model averaging to account for varying data distributions, incremental learning to progressively refine models, and local evaluation to assess performance across heterogeneous datasets. Experiments on two real-world healthcare benchmarks demonstrate that the federated approach maintains competitive predictive accuracy - within a maximum 9\% margin of centralized methods - while satisfying stringent privacy requirements. These findings underscore the viability of tree-based federated learning for scenarios where data cannot be centralized due to regulatory, competitive, or technical constraints. The proposed solution addresses a notable gap in existing federated learning libraries, offering an adaptable tool for secure distributed machine learning tasks that demand both transparency and reliable performance. The tool is available at https://github.com/ieeta-pt/fed_rf.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_08085
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	A Federated Random Forest Solution for Secure Distributed Machine Learning Cotorobai, Alexandre Silva, Jorge Miguel Oliveira, Jose Luis Machine Learning Privacy and regulatory barriers often hinder centralized machine learning solutions, particularly in sectors like healthcare where data cannot be freely shared. Federated learning has emerged as a powerful paradigm to address these concerns; however, existing frameworks primarily support gradient-based models, leaving a gap for more interpretable, tree-based approaches. This paper introduces a federated learning framework for Random Forest classifiers that preserves data privacy and provides robust performance in distributed settings. By leveraging PySyft for secure, privacy-aware computation, our method enables multiple institutions to collaboratively train Random Forest models on locally stored data without exposing sensitive information. The framework supports weighted model averaging to account for varying data distributions, incremental learning to progressively refine models, and local evaluation to assess performance across heterogeneous datasets. Experiments on two real-world healthcare benchmarks demonstrate that the federated approach maintains competitive predictive accuracy - within a maximum 9\% margin of centralized methods - while satisfying stringent privacy requirements. These findings underscore the viability of tree-based federated learning for scenarios where data cannot be centralized due to regulatory, competitive, or technical constraints. The proposed solution addresses a notable gap in existing federated learning libraries, offering an adaptable tool for secure distributed machine learning tasks that demand both transparency and reliable performance. The tool is available at https://github.com/ieeta-pt/fed_rf.
title	A Federated Random Forest Solution for Secure Distributed Machine Learning
topic	Machine Learning
url	https://arxiv.org/abs/2505.08085

Similar Items