Bibliographic Details
Main Authors: Guo, Jiajun; Luo, Xin; Zheng, Jiayin; Wang, Yiqun; Chang, Kai-Wei; Wang, Wei; Liu, Jie
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2511.23402
_version_ 1866910008173658112
author Guo, Jiajun
Luo, Xin
Zheng, Jiayin
Wang, Yiqun
Chang, Kai-Wei
Wang, Wei
Liu, Jie
contents Multimodal foundation models are increasingly trained on sensitive data such as financial records, biomedical data, and personal identifiers. When such data is distributed across partitions, training raises serious privacy concerns because of the need for cross-partition data sharing. Split learning addresses these concerns by enabling collaborative model training without raw data exchange between partitions, yet it introduces a significant challenge: transmitting high-dimensional intermediate feature representations between partitions incurs substantial communication costs. To address this challenge, we propose Quantized-TinyLLaVA, a multimodal foundation model with an integrated communication-efficient split learning framework. Our approach adopts a compression module that quantizes intermediate features into discrete representations before transmission, substantially reducing communication overhead. In addition, we derive a principled quantization strategy grounded in entropy coding theory to determine the optimal number of discrete representation levels. We deploy our framework in a two-partition setting, with one partition operating as the client and the other as the server, to realistically simulate distributed training. Under this setup, Quantized-TinyLLaVA achieves an approximate 87.5% reduction in communication overhead with 2-bit quantization, while maintaining the performance of the original 16-bit model across five benchmark datasets. Furthermore, our compressed representations exhibit enhanced resilience against feature inversion attacks, supporting the privacy of the transmitted features. The code is available at https://github.com/anonymous-1742/Quantized-TinyLLaVA.
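The 87.5% figure in the abstract is consistent with simple bit-width arithmetic: sending 2 bits per feature value instead of 16 saves 1 - 2/16 = 0.875 of the payload. The sketch below illustrates that arithmetic alongside a plain uniform scalar quantizer. It is only an illustration under stated assumptions: the abstract does not describe the internals of the paper's compression module, so the function names (`communication_reduction`, `quantize`, `dequantize`) and the min/max uniform scheme are hypothetical stand-ins, not the authors' method.

```python
def communication_reduction(bits_quantized: int, bits_original: int = 16) -> float:
    """Fraction of payload saved when each value is sent with fewer bits.

    Ignores side information (e.g. scale and offset), which is an assumption.
    """
    return 1.0 - bits_quantized / bits_original


def quantize(values, bits):
    """Uniform scalar quantization (illustrative, not the paper's module).

    Maps each float to an integer code in [0, 2**bits - 1].
    """
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    # Guard against a constant input, where hi == lo.
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Server-side reconstruction from integer codes."""
    return [lo + c * step for c in codes]


# Hypothetical intermediate features produced on the client side.
features = [-1.0, -0.2, 0.1, 0.9, 2.0]
codes, lo, step = quantize(features, bits=2)
restored = dequantize(codes, lo, step)

print(communication_reduction(2))  # 0.875
print(codes)                       # [0, 1, 1, 2, 3]
```

In a two-partition setup like the one the abstract describes, the client would transmit only `codes` (plus the small `lo` and `step` values in this sketch) and the server would call `dequantize` before continuing the forward pass.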
format Preprint
id arxiv_https___arxiv_org_abs_2511_23402
institution arXiv
publishDate 2025
record_format arxiv
title Quantized-TinyLLaVA: a new multimodal foundation model enables efficient split learning
topic Machine Learning
url https://arxiv.org/abs/2511.23402