| Main Authors: | Shakeel, Rozain, Ali, Abdul Rahman Mohammad, Mushtaq, Muneeb, Saleem, Tausifa Jan, Ashraf, Tajamul |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.19993 |
Similar Items
Deep Learning-Based Automated Segmentation of Uterine Myomas
by: Saleem, Tausifa Jan, et al.
Published: (2025)
MOTOR: Multimodal Optimal Transport via Grounded Retrieval in Medical Visual Question Answering
by: Shaaban, Mai A., et al.
Published: (2025)
Knowledge Distillation in Vision Transformers: A Critical Review
by: Habib, Gousia, et al.
Published: (2023)
Context Aware Grounded Teacher for Source Free Object Detection
by: Ashraf, Tajamul, et al.
Published: (2025)
TITAN: Query-Token based Domain Adaptive Adversarial Learning
by: Ashraf, Tajamul, et al.
Published: (2025)
FATE: Focal-modulated Attention Encoder for Multivariate Time-series Forecasting
by: Ashraf, Tajamul, et al.
Published: (2024)
HF-Fed: Hierarchical based customized Federated Learning Framework for X-Ray Imaging
by: Ashraf, Tajamul, et al.
Published: (2024)
Generalizable Federated Learning using Client Adaptive Focal Modulation
by: Ashraf, Tajamul, et al.
Published: (2025)
GroundedSurg: A Multi-Procedure Benchmark for Language-Conditioned Surgical Tool Segmentation
by: Ashraf, Tajamul, et al.
Published: (2026)
LIB-KD: Teaching Inductive Bias for Efficient Vision Transformer Distillation and Compression
by: Habib, Gousia, et al.
Published: (2023)
DuPLUS: Dual-Prompt Vision-Language Framework for Universal Medical Image Segmentation and Prognosis
by: Saeed, Numan, et al.
Published: (2025)
MedScribe: Clinically Grounded CT Reporting through Agentic Workflows
by: Orlando, Giuseppe A., et al.
Published: (2026)
MIRA: A Novel Framework for Fusing Modalities in Medical RAG
by: Wang, Jinhong, et al.
Published: (2025)
A Comprehensive Review of Knowledge Distillation in Computer Vision
by: Habib, Gousia, et al.
Published: (2024)
D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
by: Ashraf, Tajamul, et al.
Published: (2024)
GUI-C²: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning
by: Li, Junlong, et al.
Published: (2026)
POINTS-GUI-G: GUI-Grounding Journey
by: Zhao, Zhongyin, et al.
Published: (2026)
ProtoMedAgent: Multimodal Clinical Interpretability via Privacy-Aware Agentic Workflows
by: Pellicer, Alvaro Lopez, et al.
Published: (2026)
AutoFocus: Uncertainty-Aware Active Visual Search for GUI Grounding
by: Yao, Ruilin, et al.
Published: (2026)
GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)
WinDeskGround: A Benchmark for Robust GUI Grounding in Complex Multi-Window Desktop Environments
by: Zhao, Haoren, et al.
Published: (2026)
ATR-Bench: A Federated Learning Benchmark for Adaptation, Trust, and Reasoning
by: Ashraf, Tajamul, et al.
Published: (2025)
R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
by: Park, Joonhyung, et al.
Published: (2025)
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding
by: Yue, Jingkun, et al.
Published: (2025)
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare
by: Ghosh, Akash, et al.
Published: (2026)
QTrack: Query-Driven Reasoning for Multi-modal MOT
by: Ashraf, Tajamul, et al.
Published: (2026)
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
by: Li, Hongxin, et al.
Published: (2025)
Med-R2: An Adversarial Benchmark for Evidence-Grounded Reasoning in Medical VLMs
by: Ma, Wen, et al.
Published: (2026)
VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks
by: Zhou, Beitong, et al.
Published: (2025)
Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
by: Kumbhar, Shrinidhi, et al.
Published: (2026)
GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)
V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM
by: Rahman, Abdur, et al.
Published: (2024)
MedOpenClaw and MedFlowBench: Auditing Medical Agents in Full-Study Workflows
by: Shen, Weixiang, et al.
Published: (2026)
FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis
by: Maani, Fadillah, et al.
Published: (2025)
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
by: Wu, Qianhui, et al.
Published: (2025)
DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
by: Wu, Hang, et al.
Published: (2025)
FineState-Bench: Benchmarking State-Conditioned Grounding for Fine-grained GUI State Setting
by: Ji, Fengxian, et al.
Published: (2026)
AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement
by: Pei, Siqi, et al.
Published: (2026)
GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents
by: Li, Yang, et al.
Published: (2026)
SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus
by: Zhao, Ming, et al.
Published: (2025)