Staff View: CORA: Conformal Risk-Controlled Agents for Safeguarded Mobile GUI Automation :: Library Catalog

Bibliographic Details
Main Authors:	Feng, Yushi; Du, Junye; Wang, Qifan; Ma, Zizhan; Niu, Qian; Matsuo, Yutaka; Feng, Long; Yu, Lequan
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.09155

_version_	1866913021194928128
author	Feng, Yushi; Du, Junye; Wang, Qifan; Ma, Zizhan; Niu, Qian; Matsuo, Yutaka; Feng, Long; Yu, Lequan
author_facet	Feng, Yushi; Du, Junye; Wang, Qifan; Ma, Zizhan; Niu, Qian; Matsuo, Yutaka; Feng, Long; Yu, Lequan
contents	Graphical user interface (GUI) agents powered by vision language models (VLMs) are rapidly moving from passive assistance to autonomous operation. However, this unrestricted action space exposes users to severe and irreversible financial, privacy or social harm. Existing safeguards rely on prompt engineering, brittle heuristics and VLM-as-critic, and lack formal verification and user-tunable guarantees. We propose CORA (COnformal Risk-controlled GUI Agent), a post-policy, pre-action safeguarding framework that provides statistical guarantees on harmful executed actions. CORA reformulates safety as selective action execution: we train a Guardian model to estimate action-conditional risk for each proposed step. Rather than thresholding raw scores, we leverage Conformal Risk Control to calibrate an execute/abstain boundary that satisfies a user-specified risk budget. Rejected actions are routed to a trainable Diagnostician model, which performs multimodal reasoning over them to recommend interventions (e.g., confirm, reflect, or abort) that minimize user burden. A Goal-Lock mechanism anchors assessment to a clarified, frozen user intent to resist visual injection attacks. To rigorously evaluate this paradigm, we introduce Phone-Harm, a new benchmark of mobile safety violations with step-level harm labels under real-world settings. Experiments on Phone-Harm and public benchmarks against diverse baselines validate that CORA improves the safety-helpfulness-interruption Pareto frontier, offering a practical, statistically grounded safety paradigm for autonomous GUI execution. Code and benchmark are available at cora-agent.github.io.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_09155
institution	arXiv
publishDate	2026
record_format	arxiv
title	CORA: Conformal Risk-Controlled Agents for Safeguarded Mobile GUI Automation
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2604.09155
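
Note on the contents field: the execute/abstain calibration it describes can be illustrated with a minimal sketch of a Conformal Risk Control style threshold search. This is not the authors' implementation. The function names, the toy calibration data, and the loss definition (1 when a harmful action is executed, 0 otherwise) are assumptions made for illustration only. Under the usual exchangeability assumption between calibration and test steps, choosing the largest threshold whose adjusted empirical risk (harmful_executed + 1) / (n + 1) stays at or below the budget alpha bounds the expected rate of harmful executed actions by alpha.

```python
from typing import Sequence


def calibrate_threshold(scores: Sequence[float],
                        harmful: Sequence[bool],
                        alpha: float) -> float:
    """Pick an execute/abstain threshold from labeled calibration steps.

    scores  : hypothetical Guardian risk scores (higher means riskier)
    harmful : step-level harm labels for the same steps
    alpha   : user-specified risk budget in (0, 1)

    An action is executed when its score is <= the returned threshold.
    Returns -inf (abstain on everything) if no threshold fits the budget.
    """
    n = len(scores)
    if n != len(harmful):
        raise ValueError("scores and harmful must have the same length")

    pairs = sorted(zip(scores, harmful))
    tau = float("-inf")
    harmful_executed = 0
    i = 0
    while i < n:
        # Process all calibration steps that share the same score together.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            harmful_executed += int(pairs[j][1])
            j += 1
        # Adjusted risk: (n/(n+1)) * empirical risk + 1/(n+1), with loss in {0, 1}.
        if (harmful_executed + 1) / (n + 1) > alpha:
            break  # risk only grows with the threshold, so stop here
        tau = pairs[i][0]
        i = j
    return tau


def should_execute(score: float, tau: float) -> bool:
    """Execute when the risk score is within the calibrated boundary."""
    return score <= tau


# Toy calibration set (made-up numbers, not from Phone-Harm).
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]
cal_harmful = [False, False, False, False, True, False, True, True, True]
tau = calibrate_threshold(cal_scores, cal_harmful, alpha=0.25)
```

In this sketch, an action for which should_execute returns False would be handed to whatever component handles rejected actions (the Diagnostician, in the record's terms). Raising alpha widens the set of executed actions, and lowering it leads to more abstentions.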

Similar Items