Bibliographic Details
Main Authors: Li, Ang, Zhou, Yin, Raghuram, Vethavikashini Chithrra, Goldstein, Tom, Goldblum, Micah
Format: Preprint
Published: 2025
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2502.08586
author Li, Ang
Zhou, Yin
Raghuram, Vethavikashini Chithrra
Goldstein, Tom
Goldblum, Micah
contents A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). These attacks may extract private information or coerce the model into producing harmful outputs. In real-world deployments, LLMs are often part of a larger agentic pipeline including memory systems, retrieval, web access, and API calling. Such additional components introduce vulnerabilities that make these LLM-powered agents much easier to attack than isolated LLMs, yet relatively little work focuses on the security of LLM agents. In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We first provide a taxonomy of attacks categorized by threat actors, objectives, entry points, attacker observability, attack strategies, and inherent vulnerabilities of agent pipelines. We then conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities. Notably, our attacks are trivial to implement and require no understanding of machine learning.
format Preprint
id arxiv_https___arxiv_org_abs_2502_08586
institution arXiv
publishDate 2025
record_format arxiv
title Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2502.08586