Staff View: R.R.: Unveiling LLM Training Privacy through Recollection and Ranking :: Library Catalog

Bibliographic Details
Main Authors:	Meng, Wenlong; Guo, Zhenyuan; Wu, Lenan; Gong, Chen; Liu, Wenyan; Li, Weixian; Wei, Chengkun; Chen, Wenzhi
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.12658

_version_	1866918051810639872
author	Meng, Wenlong; Guo, Zhenyuan; Wu, Lenan; Gong, Chen; Liu, Wenyan; Li, Weixian; Wei, Chengkun; Chen, Wenzhi
author_facet	Meng, Wenlong; Guo, Zhenyuan; Wu, Lenan; Gong, Chen; Liu, Wenyan; Li, Weixian; Wei, Chengkun; Chen, Wenzhi
contents	Large Language Models (LLMs) pose significant privacy risks, potentially leaking training data due to implicit memorization. Existing privacy attacks primarily focus on membership inference attacks (MIAs) or data extraction attacks, but reconstructing specific personally identifiable information (PII) in LLMs' training data remains challenging. In this paper, we propose R.R. (Recollect and Rank), a novel two-step privacy stealing attack that enables attackers to reconstruct PII entities from scrubbed training data in which the PII entities have been masked. In the first stage, we introduce a prompt paradigm named recollection, which instructs the LLM to repeat a masked text while filling in the masks. We then use PII identifiers to extract the recollected PII candidates. In the second stage, we design a new criterion to score and rank the PII candidates. Motivated by membership inference, we use a reference model to calibrate our criterion. Experiments across three popular PII datasets demonstrate that R.R. achieves better PII identification performance than baselines. These results highlight the vulnerability of LLMs to PII leakage even when training data has been scrubbed. We release our code and datasets on GitHub.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_12658
institution	arXiv
publishDate	2025
record_format	arxiv
title	R.R.: Unveiling LLM Training Privacy through Recollection and Ranking
topic	Computation and Language
url	https://arxiv.org/abs/2502.12658
