Bibliographic Details
Main Authors: Hong, Kihun; Park, Sejun; Hwang, Ganguk
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2505.11035
author Hong, Kihun
Park, Sejun
Hwang, Ganguk
contents Federated learning (FL) has attracted significant attention for enabling collaborative learning without exposing private data. Among the primary variants of FL, vertical federated learning (VFL) addresses feature-partitioned data held by multiple institutions, each holding complementary information for the same set of users. However, existing VFL methods often impose restrictive assumptions, such as a small number of participating parties, fully aligned data, or the use of labeled data only. In this work, we reinterpret alignment gaps in VFL as missing data problems and propose a unified framework that accommodates both training and inference under arbitrary alignment and labeling scenarios, while supporting diverse missingness mechanisms. In experiments on 168 configurations spanning four benchmark datasets, six training-time missingness patterns, and seven testing-time missingness patterns, our method outperforms all baselines in 160 cases, with an average gap of 9.6 percentage points over the next-best competitors. To the best of our knowledge, this is the first VFL framework to jointly handle arbitrary data alignment, unlabeled data, and multi-party collaboration.
format Preprint
id arxiv_https___arxiv_org_abs_2505_11035
institution arXiv
publishDate 2025
record_format arxiv
title Deep Latent Variable Model based Vertical Federated Learning with Flexible Alignment and Labeling Scenarios
topic Machine Learning
url https://arxiv.org/abs/2505.11035