Library Catalog

Bibliographic Details
Main Authors:	Liu, Yichao; Shen, Huawen; Yu, Liu; Liu, Shiyu; Chen, Zeyu; Zhou, Yu
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.15542

Similar Items

Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding
by: Zhang, Yan, et al.
Published: (2026)

GUI-Eyes: Tool-Augmented Perception for Visual Grounding in GUI Agents
by: Chen, Chen, et al.
Published: (2026)

GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)

InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
by: Liu, Yuhang, et al.
Published: (2025)

BAMI: Training-Free Bias Mitigation in GUI Grounding
by: Zhang, Borui, et al.
Published: (2026)

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
by: Wu, Qianhui, et al.
Published: (2025)

GUI-G²: Gaussian Reward Modeling for GUI Grounding
by: Tang, Fei, et al.
Published: (2025)

GUI-PRA: Process Reward Agent for GUI Tasks
by: Xiong, Tao, et al.
Published: (2025)

GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
by: Fan, Yue, et al.
Published: (2025)

GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation
by: Liu, Tao, et al.
Published: (2025)

OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments
by: Henry, Felix, et al.
Published: (2026)

GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
by: Zhou, Yuqi, et al.
Published: (2025)

GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models
by: Wang, Yangyue, et al.
Published: (2026)

GUI-Shepherd: Reliable Process Reward and Verification for Long-Sequence GUI Tasks
by: Chen, Cong, et al.
Published: (2025)

VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
by: Han, Qijun, et al.
Published: (2026)

GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
by: Xie, Bin, et al.
Published: (2025)

GUI Knowledge Bench: Revealing the Knowledge Gap of VLMs in GUI Tasks
by: Shi, Chenrui, et al.
Published: (2025)

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior
by: Wu, Penghao, et al.
Published: (2025)

GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)

Chain-of-Ground: Improving GUI Grounding via Iterative Reasoning and Reference Feedback
by: Li, Aiden Yiliu, et al.
Published: (2025)

ZeroGUI: Automating Online GUI Learning at Zero Human Cost
by: Yang, Chenyu, et al.
Published: (2025)

META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
by: Sun, Liangtai, et al.
Published: (2022)

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
by: Cheng, Kanzhi, et al.
Published: (2024)

GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training
by: Cao, Yuan, et al.
Published: (2026)

Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
by: Kumbhar, Shrinidhi, et al.
Published: (2026)

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement
by: Pei, Siqi, et al.
Published: (2026)

UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
by: Chen, Liangyu, et al.
Published: (2025)

Test-Time Reinforcement Learning for GUI Grounding via Region Consistency
by: Du, Yong, et al.
Published: (2025)

CRAFT-GUI: Curriculum-Reinforced Agent For GUI Tasks
by: Nong, Songqin, et al.
Published: (2025)

DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
by: Wu, Hang, et al.
Published: (2025)

MPR-GUI: Benchmarking and Enhancing Multilingual Perception and Reasoning in GUI Agents
by: Chen, Ruihan, et al.
Published: (2025)

Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
by: Wang, Suyuchen, et al.
Published: (2025)

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
by: Zhu, Jiachen, et al.
Published: (2026)

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
by: Tang, Fei, et al.
Published: (2026)

GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent
by: Zhao, Kangjia, et al.
Published: (2024)

GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
by: Zhou, Shijie, et al.
Published: (2025)

Aria-UI: Visual Grounding for GUI Instructions
by: Yang, Yuhao, et al.
Published: (2024)

GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning
by: Gao, Longxi, et al.
Published: (2025)

GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration
by: Sun, Yuchen, et al.
Published: (2025)

LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning
by: Wu, Yubin, et al.
Published: (2026)