Bibliographic Details
Main Authors: Wang, Qingni, Fan, Yue, Wang, Xin Eric
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2602.02419
_version_ 1866915769843974144
author Wang, Qingni
Fan, Yue
Wang, Xin Eric
author_facet Wang, Qingni
Fan, Yue
Wang, Xin Eric
contents Graphical User Interface (GUI) grounding aims to translate natural language instructions into executable screen coordinates, enabling automated GUI interaction. Nevertheless, incorrect grounding can result in costly, hard-to-reverse actions (e.g., erroneous payment approvals), raising concerns about model reliability. In this paper, we introduce SafeGround, an uncertainty-aware framework for GUI grounding models that enables risk-aware predictions through calibration performed before testing. SafeGround leverages a distribution-aware uncertainty quantification method to capture the spatial dispersion of stochastic samples drawn from the outputs of any given model. Then, through the calibration process, SafeGround derives a test-time decision threshold with statistically guaranteed false discovery rate (FDR) control. We apply SafeGround to multiple GUI grounding models on the challenging ScreenSpot-Pro benchmark. Experimental results show that our uncertainty measure consistently outperforms existing baselines in distinguishing correct from incorrect predictions, while the calibrated threshold enables rigorous risk control and offers the potential for substantial system-level accuracy improvements. Across multiple GUI grounding models, SafeGround improves system-level accuracy by up to 5.38 percentage points over Gemini-only inference.
format Preprint
id arxiv_https___arxiv_org_abs_2602_02419
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration
Wang, Qingni
Fan, Yue
Wang, Xin Eric
Artificial Intelligence
Software Engineering
title SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration
topic Artificial Intelligence
Software Engineering
url https://arxiv.org/abs/2602.02419