:: Library Catalog

Bibliographic Details
Main Authors:	Shakeel, Rozain; Ali, Abdul Rahman Mohammad; Mushtaq, Muneeb; Saleem, Tausifa Jan; Ashraf, Tajamul
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.19993

Similar Items

Deep Learning-Based Automated Segmentation of Uterine Myomas
by: Saleem, Tausifa Jan, et al.
Published: (2025)

MOTOR: Multimodal Optimal Transport via Grounded Retrieval in Medical Visual Question Answering
by: Shaaban, Mai A., et al.
Published: (2025)

Knowledge Distillation in Vision Transformers: A Critical Review
by: Habib, Gousia, et al.
Published: (2023)

Context Aware Grounded Teacher for Source Free Object Detection
by: Ashraf, Tajamul, et al.
Published: (2025)

TITAN: Query-Token based Domain Adaptive Adversarial Learning
by: Ashraf, Tajamul, et al.
Published: (2025)

FATE: Focal-modulated Attention Encoder for Multivariate Time-series Forecasting
by: Ashraf, Tajamul, et al.
Published: (2024)

HF-Fed: Hierarchical based customized Federated Learning Framework for X-Ray Imaging
by: Ashraf, Tajamul, et al.
Published: (2024)

Generalizable Federated Learning using Client Adaptive Focal Modulation
by: Ashraf, Tajamul, et al.
Published: (2025)

GroundedSurg: A Multi-Procedure Benchmark for Language-Conditioned Surgical Tool Segmentation
by: Ashraf, Tajamul, et al.
Published: (2026)

LIB-KD: Teaching Inductive Bias for Efficient Vision Transformer Distillation and Compression
by: Habib, Gousia, et al.
Published: (2023)

DuPLUS: Dual-Prompt Vision-Language Framework for Universal Medical Image Segmentation and Prognosis
by: Saeed, Numan, et al.
Published: (2025)

MedScribe: Clinically Grounded CT Reporting through Agentic Workflows
by: Orlando, Giuseppe A., et al.
Published: (2026)

MIRA: A Novel Framework for Fusing Modalities in Medical RAG
by: Wang, Jinhong, et al.
Published: (2025)

A Comprehensive Review of Knowledge Distillation in Computer Vision
by: Habib, Gousia, et al.
Published: (2024)

D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
by: Ashraf, Tajamul, et al.
Published: (2024)

GUI-C$^2$: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning
by: Li, Junlong, et al.
Published: (2026)

POINTS-GUI-G: GUI-Grounding Journey
by: Zhao, Zhongyin, et al.
Published: (2026)

ProtoMedAgent: Multimodal Clinical Interpretability via Privacy-Aware Agentic Workflows
by: Pellicer, Alvaro Lopez, et al.
Published: (2026)

AutoFocus: Uncertainty-Aware Active Visual Search for GUI Grounding
by: Yao, Ruilin, et al.
Published: (2026)

GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)

WinDeskGround: A Benchmark for Robust GUI Grounding in Complex Multi-Window Desktop Environments
by: Zhao, Haoren, et al.
Published: (2026)

ATR-Bench: A Federated Learning Benchmark for Adaptation, Trust, and Reasoning
by: Ashraf, Tajamul, et al.
Published: (2025)

R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
by: Park, Joonhyung, et al.
Published: (2025)

MedSG-Bench: A Benchmark for Medical Image Sequences Grounding
by: Yue, Jingkun, et al.
Published: (2025)

CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare
by: Ghosh, Akash, et al.
Published: (2026)

QTrack: Query-Driven Reasoning for Multi-modal MOT
by: Ashraf, Tajamul, et al.
Published: (2026)

AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
by: Li, Hongxin, et al.
Published: (2025)

Med-R2: An Adversarial Benchmark for Evidence-Grounded Reasoning in Medical VLMs
by: Ma, Wen, et al.
Published: (2026)

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks
by: Zhou, Beitong, et al.
Published: (2025)

Towards GUI Agents: Vision-Language Diffusion Models for GUI Grounding
by: Kumbhar, Shrinidhi, et al.
Published: (2026)

GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)

V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM
by: Rahman, Abdur, et al.
Published: (2024)

MedOpenClaw and MedFlowBench: Auditing Medical Agents in Full-Study Workflows
by: Shen, Weixiang, et al.
Published: (2026)

FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis
by: Maani, Fadillah, et al.
Published: (2025)

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
by: Wu, Qianhui, et al.
Published: (2025)

DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
by: Wu, Hang, et al.
Published: (2025)

FineState-Bench: Benchmarking State-Conditioned Grounding for Fine-grained GUI State Setting
by: Ji, Fengxian, et al.
Published: (2026)

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement
by: Pei, Siqi, et al.
Published: (2026)

GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents
by: Li, Yang, et al.
Published: (2026)

SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus
by: Zhao, Ming, et al.
Published: (2025)