Staff View: Quantized-TinyLLaVA: a new multimodal foundation model enables efficient split learning :: Library Catalog

Bibliographic Details
Main Authors:	Guo, Jiajun; Luo, Xin; Zheng, Jiayin; Wang, Yiqun; Chang, Kai-Wei; Wang, Wei; Liu, Jie
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2511.23402

_version_	1866910008173658112
author	Guo, Jiajun; Luo, Xin; Zheng, Jiayin; Wang, Yiqun; Chang, Kai-Wei; Wang, Wei; Liu, Jie
author_facet	Guo, Jiajun; Luo, Xin; Zheng, Jiayin; Wang, Yiqun; Chang, Kai-Wei; Wang, Wei; Liu, Jie
contents	Multimodal foundation models are increasingly trained on sensitive data across domains such as finance, biomedicine, and personal identifiers. However, when training is distributed across data partitions, the need for cross-partition data sharing raises serious privacy concerns. Split learning addresses these concerns by enabling collaborative model training without raw data exchange between partitions, yet it introduces a significant challenge: transmitting high-dimensional intermediate feature representations between partitions leads to substantial communication costs. To address this challenge, we propose Quantized-TinyLLaVA, a multimodal foundation model with an integrated communication-efficient split learning framework. Our approach adopts a compression module that quantizes intermediate features into discrete representations before transmission, substantially reducing communication overhead. In addition, we derive a principled quantization strategy grounded in entropy coding theory to determine the optimal number of discrete representation levels. We deploy our framework in a two-partition setting, with one partition operating as the client and the other as the server, to realistically simulate distributed training. Under this setup, Quantized-TinyLLaVA achieves an approximately 87.5% reduction in communication overhead with 2-bit quantization, while maintaining the performance of the original 16-bit model across five benchmark datasets. Furthermore, our compressed representations exhibit enhanced resilience against feature inversion attacks, validating the privacy of transmission. The code is available at https://github.com/anonymous-1742/Quantized-TinyLLaVA.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_23402
institution	arXiv
publishDate	2025
record_format	arxiv
title	Quantized-TinyLLaVA: a new multimodal foundation model enables efficient split learning
topic	Machine Learning
url	https://arxiv.org/abs/2511.23402
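
Note on the abstract's figures: the reported reduction of about 87.5% matches the simple ratio of sending 2 bits per feature value instead of 16 (1 - 2/16 = 0.875). The sketch below illustrates that arithmetic with a generic uniform scalar quantizer. It is an assumption for illustration only, not the compression module described in the preprint, whose design is not given in this record. The function names `quantize_uniform`, `dequantize_uniform`, and `communication_reduction` are hypothetical, and the small cost of sending the offset and step size is ignored.

```python
def quantize_uniform(features, bits):
    """Map float features to integer codes in [0, 2**bits - 1].

    Generic uniform scalar quantizer (an assumption for illustration,
    not the preprint's method).
    """
    levels = 2 ** bits
    lo, hi = min(features), max(features)
    # Step size between adjacent levels; fall back to 1.0 for constant input.
    scale = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [min(levels - 1, max(0, round((x - lo) / scale))) for x in features]
    return codes, lo, scale


def dequantize_uniform(codes, lo, scale):
    """Reconstruct approximate feature values from integer codes."""
    return [c * scale + lo for c in codes]


def communication_reduction(bits, baseline_bits=16):
    """Fraction of payload saved per value, ignoring side information."""
    return 1.0 - bits / baseline_bits


if __name__ == "__main__":
    feats = [-1.0, -0.2, 0.3, 1.0]  # made-up feature values
    codes, lo, scale = quantize_uniform(feats, bits=2)
    print(codes)
    print(dequantize_uniform(codes, lo, scale))
    print(communication_reduction(2))
```

With 2 bits each value takes one of four levels, so the reconstruction error of this quantizer is at most half a step. How many levels can be used without hurting accuracy is the question the abstract says the authors address with entropy coding theory.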

Similar Items