| Main Authors: | Zhao, Haoren, Chen, Tianyi, Wang, Zhen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.04716 |
Similar Items
WinDeskGround: A Benchmark for Robust GUI Grounding in Complex Multi-Window Desktop Environments
by: Zhao, Haoren, et al.
Published: (2026)
POINTS-GUI-G: GUI-Grounding Journey
by: Zhao, Zhongyin, et al.
Published: (2026)
SparkUI-Parser: Enhancing GUI Perception with Robust Grounding and Parsing
by: Jing, Hongyi, et al.
Published: (2025)
Generalist Scanner Meets Specialist Locator: A Synergistic Coarse-to-Fine Framework for Robust GUI Grounding
by: Li, Zhecheng, et al.
Published: (2025)
Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks
by: Xie, Peng, et al.
Published: (2024)
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
by: Li, Hongxin, et al.
Published: (2025)
Phi-Ground Tech Report: Advancing Perception in GUI Grounding
by: Zhang, Miaosen, et al.
Published: (2025)
Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation
by: Chu, Tianyi, et al.
Published: (2024)
GoClick: Lightweight Element Grounding Model for Autonomous GUI Interaction
by: Li, Hongxin, et al.
Published: (2026)
Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
by: Kumbhar, Shrinidhi, et al.
Published: (2026)
GUI-C$^2$: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning
by: Li, Junlong, et al.
Published: (2026)
Learning GUI Grounding with Spatial Reasoning from Visual Feedback
by: Zhao, Yu, et al.
Published: (2025)
Robust Identity Perceptual Watermark Against Deepfake Face Swapping
by: Wang, Tianyi, et al.
Published: (2023)
Trifuse: Enhancing Attention-Based GUI Grounding via Multimodal Fusion
by: Ma, Longhui, et al.
Published: (2026)
GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)
How Auxiliary Reasoning Unleashes GUI Grounding in VLMs
by: Li, Weiming, et al.
Published: (2025)
Robustness of Vision Language Models Against Split-Image Harmful Input Attacks
by: Rashid, Md Rafi Ur, et al.
Published: (2026)
Robust Spiking Neural Networks Against Adversarial Attacks
by: Wang, Shuai, et al.
Published: (2026)
Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning
by: Xu, Hai-Ming, et al.
Published: (2024)
Unified Prompt Attack Against Text-to-Image Generation Models
by: Peng, Duo, et al.
Published: (2025)
AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement
by: Pei, Siqi, et al.
Published: (2026)
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
by: Wu, Qianhui, et al.
Published: (2025)
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
by: Fan, Yue, et al.
Published: (2024)
MVP: Multiple View Prediction Improves GUI Grounding
by: Zhang, Yunzhu, et al.
Published: (2025)
GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding
by: Tang, Fei, et al.
Published: (2025)
SIFT-Graph: Benchmarking Multimodal Defense Against Image Adversarial Attacks With Robust Feature Graph
by: He, Jingjie, et al.
Published: (2025)
GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
by: Fan, Yue, et al.
Published: (2025)
R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
by: Park, Joonhyung, et al.
Published: (2025)
ResGuard: Enhancing Robustness Against Known Original Attacks in Deep Watermarking
by: Wang, Hanyi, et al.
Published: (2026)
Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models
by: Zhao, Ruisi, et al.
Published: (2026)
Enhancing Trustworthy GUI Grounding via Self-Critiqued Reinforcement Learning
by: Zhang, Shaojie, et al.
Published: (2025)
Towards Unified Robustness Against Both Backdoor and Adversarial Attacks
by: Niu, Zhenxing, et al.
Published: (2024)
Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers
by: Nokabadi, Fatemeh Nourilenjan, et al.
Published: (2024)
Robust and Transferable Backdoor Attacks Against Deep Image Compression With Selective Frequency Prior
by: Yu, Yi, et al.
Published: (2024)
Membership Inference Attack Against Masked Image Modeling
by: Li, Zheng, et al.
Published: (2024)
UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
by: Chen, Liangyu, et al.
Published: (2025)
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
by: Wang, Suyuchen, et al.
Published: (2025)
GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)
BAMI: Training-Free Bias Mitigation in GUI Grounding
by: Zhang, Borui, et al.
Published: (2026)
Towards Robust Content Watermarking Against Removal and Forgery Attacks
by: Zhu, Yifan, et al.
Published: (2026)