Staff View: DP^2-VL: Private Photo Dataset Protection by Data Poisoning for Vision-Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Miao, Hongyi; Jia, Jun; Wang, Xincheng; Ma, Qianli; Sun, Wei; Zhou, Wangqiu; Zhu, Dandan; Cao, Yewen; Liu, Zhi; Zhai, Guangtao
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.23925

_version_	1866911543353933824
author	Miao, Hongyi; Jia, Jun; Wang, Xincheng; Ma, Qianli; Sun, Wei; Zhou, Wangqiu; Zhu, Dandan; Cao, Yewen; Liu, Zhi; Zhai, Guangtao
author_facet	Miao, Hongyi; Jia, Jun; Wang, Xincheng; Ma, Qianli; Sun, Wei; Zhou, Wangqiu; Zhu, Dandan; Cao, Yewen; Liu, Zhi; Zhai, Guangtao
contents	Recent advances in visual-language alignment have endowed vision-language models (VLMs) with fine-grained image understanding capabilities. However, this progress also introduces new privacy risks. This paper first proposes a novel privacy threat model named identity-affiliation learning: an attacker fine-tunes a VLM using only a few private photos of a target individual, thereby embedding associations between the target's facial identity and their private property and social relationships into the model's internal representations. Once deployed via public APIs, this model enables unauthorized exposure of the target user's private information whenever their photos are supplied as input. To benchmark VLMs' susceptibility to such identity-affiliation leakage, we introduce the first identity-affiliation dataset, comprising seven typical scenarios that appear in private photos. Each scenario is instantiated with multiple identity-centered photo-description pairs. Experimental results demonstrate that mainstream VLMs such as LLaVA, Qwen-VL, and MiniGPT-v2 can recognize facial identities and infer identity-affiliation relationships after fine-tuning on small-scale private photo datasets, and even on synthetically generated datasets. To mitigate this privacy risk, we propose DP^2-VL, the first Dataset Protection framework for private photos that leverages Data Poisoning. By optimizing imperceptible perturbations that push the original representations toward an antithetical region, DP^2-VL induces a dataset-level shift in the embedding space of VLMs' encoders. This shift separates protected images from clean inference images, causing fine-tuning on the protected set to overfit. Extensive experiments demonstrate that DP^2-VL achieves strong generalization across models, robustness to diverse post-processing operations, and consistent effectiveness across varying protection ratios.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_23925
institution	arXiv
publishDate	2026
record_format	arxiv
title	DP^2-VL: Private Photo Dataset Protection by Data Poisoning for Vision-Language Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2603.23925
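
The contents field above describes protection by optimizing imperceptible perturbations that push an image's representation toward an antithetical region of the encoder's embedding space. The following is a minimal sketch of that general idea only, not the authors' implementation. It assumes a toy linear encoder in place of a VLM image encoder, assumes the "antithetical" target is simply the negated clean embedding, and uses hypothetical names throughout (encode, distance, protect, epsilon, step, iters).

```python
import random


def encode(weights, image):
    """Toy linear encoder: one dot product per embedding dimension."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def protect(weights, image, epsilon=0.03, step=0.005, iters=40):
    """Projected sign-gradient descent under an L-infinity budget."""
    # Assumption: the "antithetical" target is the negated clean embedding.
    target = [-v for v in encode(weights, image)]
    protected = list(image)
    for _ in range(iters):
        residual = [e - t for e, t in zip(encode(weights, protected), target)]
        # Gradient of 0.5 * ||W p - target||^2 with respect to p is W^T residual.
        grad = [
            sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(image))
        ]
        for j, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            moved = protected[j] - step * sign
            # Stay within epsilon of the original pixel and inside [0, 1].
            low = max(0.0, image[j] - epsilon)
            high = min(1.0, image[j] + epsilon)
            protected[j] = min(max(moved, low), high)
    return protected


# Hypothetical data: an 8-dimensional embedding of a 16-pixel "image".
rng = random.Random(0)
weights = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
image = [rng.random() for _ in range(16)]
protected = protect(weights, image)
```

The epsilon bound is what keeps the change small per pixel in this sketch. A real encoder is nonlinear, so the gradient would come from automatic differentiation rather than the closed form used here.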