Staff View: IMPersona: Evaluating Individual Level LM Impersonation :: Library Catalog

Bibliographic Details
Main Authors:	Shi, Quan; Jimenez, Carlos E.; Dong, Stephen; Seo, Brian; Yao, Caden; Kelch, Adam; Narasimhan, Karthik
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2504.04332

_version_	1866912410623803392
author	Shi, Quan; Jimenez, Carlos E.; Dong, Stephen; Seo, Brian; Yao, Caden; Kelch, Adam; Narasimhan, Karthik
author_facet	Shi, Quan; Jimenez, Carlos E.; Dong, Stephen; Seo, Brian; Yao, Caden; Kelch, Adam; Narasimhan, Karthik
contents	As language models achieve increasingly human-like capabilities in conversational text generation, a critical question emerges: to what extent can these systems simulate the characteristics of specific individuals? To evaluate this, we introduce IMPersona, a framework for evaluating LMs at impersonating specific individuals' writing style and personal knowledge. Using supervised fine-tuning and a hierarchical memory-inspired retrieval system, we demonstrate that even modestly sized open-source models, such as Llama-3.1-8B-Instruct, can achieve impersonation abilities at concerning levels. In blind conversation experiments, participants (mis)identified our fine-tuned models with memory integration as human in 44.44% of interactions, compared to just 25.00% for the best prompting-based approach. We analyze these results to propose detection methods and defense strategies against such impersonation attempts. Our findings raise important questions about both the potential applications and risks of personalized language models, particularly regarding privacy, security, and the ethical deployment of such technologies in real-world contexts.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_04332
institution	arXiv
publishDate	2025
record_format	arxiv
title	IMPersona: Evaluating Individual Level LM Impersonation
topic	Computation and Language; Artificial Intelligence; Human-Computer Interaction
url	https://arxiv.org/abs/2504.04332