Staff View: On the Robustness of GUI Grounding Models Against Image Attacks :: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Haoren; Chen, Tianyi; Wang, Zhen
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.04716

_version_	1866909568691339264
author	Zhao, Haoren; Chen, Tianyi; Wang, Zhen
author_facet	Zhao, Haoren; Chen, Tianyi; Wang, Zhen
contents	Graphical User Interface (GUI) grounding models are crucial for enabling intelligent agents to understand and interact with complex visual interfaces. However, these models face significant robustness challenges in real-world scenarios due to natural noise and adversarial perturbations, and their robustness remains underexplored. In this study, we systematically evaluate the robustness of state-of-the-art GUI grounding models, such as UGround, under three conditions: natural noise, untargeted adversarial attacks, and targeted adversarial attacks. Our experiments, which were conducted across a wide range of GUI environments, including mobile, desktop, and web interfaces, have clearly demonstrated that GUI grounding models exhibit a high degree of sensitivity to adversarial perturbations and low-resolution conditions. These findings provide valuable insights into the vulnerabilities of GUI grounding models and establish a strong benchmark for future research aimed at enhancing their robustness in practical applications. Our code is available at https://github.com/ZZZhr-1/Robust_GUI_Grounding.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_04716
institution	arXiv
publishDate	2025
record_format	arxiv
title	On the Robustness of GUI Grounding Models Against Image Attacks
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2504.04716

Similar Items