Bibliographic Details
Main Authors: Mai, Kimberly T., Gausen, Anna, Dubois, Magda, Murad, Mona, O'Dell, Bessie, Staes-Polet, Nadine, Summerfield, Christopher, Strait, Andrew
Format: Preprint
Published: 2026
Subjects: Computers and Society
Online Access: https://arxiv.org/abs/2602.21831
author Mai, Kimberly T.
Gausen, Anna
Dubois, Magda
Murad, Mona
O'Dell, Bessie
Staes-Polet, Nadine
Summerfield, Christopher
Strait, Andrew
contents AI is increasingly being used to assist fraud and cybercrime. However, the extent to which current large language models can provide useful information for complex criminal activity remains unclear. Working with law enforcement and policy experts, we developed multi-turn evaluations for three fraud and cybercrime scenarios (romance scams, CEO impersonation, and identity theft). Our evaluations focus on text-to-text interactions. In each scenario, we evaluate whether models provide actionable assistance beyond information typically available on the web, as assessed by domain experts. We do so in ways designed to resemble real-world misuse, such as breaking down requests for fraud into a sequence of seemingly benign queries. We found that (1) current large language models provide minimal actionable information for fraud and cybercrime without the use of advanced jailbreaking techniques, (2) model safeguards have a significant impact on the provision of information, with the two open-weight large language models fine-tuned to remove safety guardrails providing the most actionable and useful responses, and (3) decomposing requests into benign-seeming queries elicited more assistance than explicitly malicious framing or basic system-level jailbreaks. Overall, the results suggest that, without extensive effort to circumvent safeguards, current text-generation models provide relatively minimal uplift for fraud and cybercrime through information provision. This work contributes a reproducible, expert-grounded framework for tracking how these risks may evolve over time as models grow more capable and adversaries adapt.
format Preprint
id arxiv_https___arxiv_org_abs_2602_21831
institution arXiv
publishDate 2026
record_format arxiv
title A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios
topic Computers and Society
url https://arxiv.org/abs/2602.21831