Staff View: LLM Anonymization Against Agentic Re-Identification :: Library Catalog


Bibliographic Details
Main Authors:	Li, Ziwen; Wen, Jianing; Li, Tianshi
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Computation and Language
Online Access:	https://arxiv.org/abs/2605.30848

_version_	1866911741032529920
author	Li, Ziwen; Wen, Jianing; Li, Tianshi
author_facet	Li, Ziwen; Wen, Jianing; Li, Tianshi
contents	Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention. We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask-reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier by using adaptive privacy scope to strengthen resistance to agentic re-identification and using a mask-reconstruct anonymization method to better preserve contextual utility under fixed privacy scope.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_30848
institution	arXiv
publishDate	2026
record_format	arxiv
title	LLM Anonymization Against Agentic Re-Identification
topic	Cryptography and Security; Computation and Language
url	https://arxiv.org/abs/2605.30848
