Guardado en:
Detalles Bibliográficos
Autores principales: Kou, Zhoubin, Chen, Zihan, Yang, Jing, Shen, Cong
Formato: Preprint
Publicado: 2026
Materias:
Acceso en línea:https://arxiv.org/abs/2601.09076
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866915727926099968
author Kou, Zhoubin
Chen, Zihan
Yang, Jing
Shen, Cong
author_facet Kou, Zhoubin
Chen, Zihan
Yang, Jing
Shen, Cong
contents Split Federated Learning (SFL) enables collaborative training between resource-constrained edge devices and a compute-rich server. Communication overhead is a central issue in SFL and can be mitigated with auxiliary networks. Yet, the fundamental client-side computation challenge remains, as back-propagation requires substantial memory and computation costs, severely limiting the scale of models that edge devices can support. To enable more resource-efficient client computation and reduce the client-server communication, we propose HERON-SFL, a novel hybrid optimization framework that integrates zeroth-order (ZO) optimization for local client training while retaining first-order (FO) optimization on the server. With the assistance of auxiliary networks, ZO updates enable clients to approximate local gradients using perturbed forward-only evaluations per step, eliminating memory-intensive activation caching and avoiding explicit gradient computation in the traditional training process. Leveraging the low effective rank assumption, we theoretically prove that HERON-SFL's convergence rate is independent of model dimensionality, addressing a key scalability concern common to ZO algorithms. Empirically, on ResNet training and language model (LM) fine-tuning tasks, HERON-SFL matches benchmark accuracy while reducing client peak memory by up to 64% and client-side compute cost by up to 33% per step, substantially expanding the range of models that can be trained or adapted on resource-limited devices.
format Preprint
id arxiv_https___arxiv_org_abs_2601_09076
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning
Kou, Zhoubin
Chen, Zihan
Yang, Jing
Shen, Cong
Machine Learning
Distributed, Parallel, and Cluster Computing
Information Theory
Networking and Internet Architecture
Signal Processing
Split Federated Learning (SFL) enables collaborative training between resource-constrained edge devices and a compute-rich server. Communication overhead is a central issue in SFL and can be mitigated with auxiliary networks. Yet, the fundamental client-side computation challenge remains, as back-propagation requires substantial memory and computation costs, severely limiting the scale of models that edge devices can support. To enable more resource-efficient client computation and reduce the client-server communication, we propose HERON-SFL, a novel hybrid optimization framework that integrates zeroth-order (ZO) optimization for local client training while retaining first-order (FO) optimization on the server. With the assistance of auxiliary networks, ZO updates enable clients to approximate local gradients using perturbed forward-only evaluations per step, eliminating memory-intensive activation caching and avoiding explicit gradient computation in the traditional training process. Leveraging the low effective rank assumption, we theoretically prove that HERON-SFL's convergence rate is independent of model dimensionality, addressing a key scalability concern common to ZO algorithms. Empirically, on ResNet training and language model (LM) fine-tuning tasks, HERON-SFL matches benchmark accuracy while reducing client peak memory by up to 64% and client-side compute cost by up to 33% per step, substantially expanding the range of models that can be trained or adapted on resource-limited devices.
title Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning
topic Machine Learning
Distributed, Parallel, and Cluster Computing
Information Theory
Networking and Internet Architecture
Signal Processing
url https://arxiv.org/abs/2601.09076