Bibliographic Details
Main Authors: Yang, Yulong, Yang, Xinshan, Li, Shuaidong, Lin, Chenhao, Zhao, Zhengyu, Shen, Chao, Zhang, Tianwei
Format: Preprint
Published: 2024
Subjects: Cryptography and Security
Online Access: https://arxiv.org/abs/2407.09295
_version_ 1866917957452431360
author Yang, Yulong
Yang, Xinshan
Li, Shuaidong
Lin, Chenhao
Zhao, Zhengyu
Shen, Chao
Zhang, Tianwei
contents The integration of Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) into mobile GUI agents has significantly enhanced user efficiency and experience. However, this advancement also introduces potential security vulnerabilities that have yet to be thoroughly explored. In this paper, we present a systematic security investigation of multi-modal mobile GUI agents, addressing this critical gap in the existing literature. Our contributions are twofold: (1) we propose a novel threat modeling methodology, leading to the discovery and feasibility analysis of 34 previously unreported attacks, and (2) we design an attack framework to systematically construct and evaluate these threats. Through a combination of real-world case studies and extensive dataset-driven experiments, we validate the severity and practicality of those attacks, highlighting the pressing need for robust security measures in mobile GUI systems.
format Preprint
id arxiv_https___arxiv_org_abs_2407_09295
institution arXiv
publishDate 2024
record_format arxiv
title Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents
topic Cryptography and Security
url https://arxiv.org/abs/2407.09295