| Main Authors: | Liu, Yichao, Shen, Huawen, Yu, Liu, Liu, Shiyu, Chen, Zeyu, Zhou, Yu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.15542 |
Similar Items
Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding
by: Zhang, Yan, et al.
Published: (2026)
GUI-Eyes: Tool-Augmented Perception for Visual Grounding in GUI Agents
by: Chen, Chen, et al.
Published: (2026)
GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
by: Liu, Yuhang, et al.
Published: (2025)
BAMI: Training-Free Bias Mitigation in GUI Grounding
by: Zhang, Borui, et al.
Published: (2026)
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
by: Wu, Qianhui, et al.
Published: (2025)
GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding
by: Tang, Fei, et al.
Published: (2025)
GUI-PRA: Process Reward Agent for GUI Tasks
by: Xiong, Tao, et al.
Published: (2025)
GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
by: Fan, Yue, et al.
Published: (2025)
GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation
by: Liu, Tao, et al.
Published: (2025)
OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments
by: Henry, Felix, et al.
Published: (2026)
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
by: Zhou, Yuqi, et al.
Published: (2025)
GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models
by: Wang, Yangyue, et al.
Published: (2026)
GUI-Shepherd: Reliable Process Reward and Verification for Long-Sequence GUI Tasks
by: Chen, Cong, et al.
Published: (2025)
VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
by: Han, Qijun, et al.
Published: (2026)
GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
by: Xie, Bin, et al.
Published: (2025)
GUI Knowledge Bench: Revealing the Knowledge Gap of VLMs in GUI Tasks
by: Shi, Chenrui, et al.
Published: (2025)
GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior
by: Wu, Penghao, et al.
Published: (2025)
GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)
Chain-of-Ground: Improving GUI Grounding via Iterative Reasoning and Reference Feedback
by: Li, Aiden Yiliu, et al.
Published: (2025)
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
by: Yang, Chenyu, et al.
Published: (2025)
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
by: Sun, Liangtai, et al.
Published: (2022)
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
by: Cheng, Kanzhi, et al.
Published: (2024)
GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training
by: Cao, Yuan, et al.
Published: (2026)
Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
by: Kumbhar, Shrinidhi, et al.
Published: (2026)
AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement
by: Pei, Siqi, et al.
Published: (2026)
UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
by: Chen, Liangyu, et al.
Published: (2025)
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency
by: Du, Yong, et al.
Published: (2025)
CRAFT-GUI: Curriculum-Reinforced Agent For GUI Tasks
by: Nong, Songqin, et al.
Published: (2025)
DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
by: Wu, Hang, et al.
Published: (2025)
MPR-GUI: Benchmarking and Enhancing Multilingual Perception and Reasoning in GUI Agents
by: Chen, Ruihan, et al.
Published: (2025)
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
by: Wang, Suyuchen, et al.
Published: (2025)
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
by: Zhu, Jiachen, et al.
Published: (2026)
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
by: Tang, Fei, et al.
Published: (2026)
GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent
by: Zhao, Kangjia, et al.
Published: (2024)
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
by: Zhou, Shijie, et al.
Published: (2025)
Aria-UI: Visual Grounding for GUI Instructions
by: Yang, Yuhao, et al.
Published: (2024)
GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning
by: Gao, Longxi, et al.
Published: (2025)
GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration
by: Sun, Yuchen, et al.
Published: (2025)
LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning
by: Wu, Yubin, et al.
Published: (2026)