Staff View: A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios :: Library Catalog

Bibliographic Details
Main Authors:	Mai, Kimberly T.; Gausen, Anna; Dubois, Magda; Murad, Mona; O'Dell, Bessie; Staes-Polet, Nadine; Summerfield, Christopher; Strait, Andrew
Format:	Preprint
Published:	2026
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2602.21831

_version_	1866914366089068544
author	Mai, Kimberly T.; Gausen, Anna; Dubois, Magda; Murad, Mona; O'Dell, Bessie; Staes-Polet, Nadine; Summerfield, Christopher; Strait, Andrew
author_facet	Mai, Kimberly T.; Gausen, Anna; Dubois, Magda; Murad, Mona; O'Dell, Bessie; Staes-Polet, Nadine; Summerfield, Christopher; Strait, Andrew
contents	AI is increasingly being used to assist fraud and cybercrime. However, the extent to which current large language models can provide useful information for complex criminal activity is unclear. Working with law enforcement and policy experts, we developed multi-turn evaluations for three fraud and cybercrime scenarios (romance scams, CEO impersonation, and identity theft). Our evaluations focus on text-to-text interactions. In each scenario, we evaluate whether models provide actionable assistance beyond information typically available on the web, as assessed by domain experts. We do so in ways designed to resemble real-world misuse, such as breaking down requests for fraud into a sequence of seemingly benign queries. We found that (1) current large language models provide minimal actionable information for fraud and cybercrime without the use of advanced jailbreaking techniques, (2) model safeguards have a significant impact on the provision of information, with the two open-weight large language models fine-tuned to remove safety guardrails providing the most actionable and useful responses, and (3) decomposing requests into benign-seeming queries elicited more assistance than explicitly malicious framing or basic system-level jailbreaks. Overall, the results suggest that current text-generation models provide relatively minimal uplift for fraud and cybercrime through information provision, without extensive effort to circumvent safeguards. This work contributes a reproducible, expert-grounded framework for tracking how these risks may evolve over time as models grow more capable and adversaries adapt.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_21831
institution	arXiv
publishDate	2026
record_format	arxiv
title	A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios
topic	Computers and Society
url	https://arxiv.org/abs/2602.21831
