:: Library Catalog

Bibliographic Details
Main Authors:	Salman, Muhammad Umar; Qazi, Mohammad Areeb; Alam, Mohammed Talha
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.17880

Similar Items

Introducing SDICE: An Index for Assessing Diversity of Synthetic Medical Datasets
by: Alam, Mohammed Talha, et al.
Published: (2024)

MAFM^3: Modular Adaptation of Foundation Models for Multi-Modal Medical AI
by: Qazi, Mohammad Areeb, et al.
Published: (2025)

DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images
by: Qazi, Mohammad Areeb, et al.
Published: (2024)

UNICON: UNIfied CONtinual Learning for Medical Foundational Models
by: Qazi, Mohammad Areeb, et al.
Published: (2025)

MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks
by: Almakky, Ibrahim, et al.
Published: (2024)

Continual Learning in Medical Imaging: A Survey and Practical Analysis
by: Qazi, Mohammad Areeb, et al.
Published: (2024)

Projected Gradient Unlearning for Text-to-Image Diffusion Models: Defending Against Concept Revival Attacks
by: Aladawi, Aljalila, et al.
Published: (2026)

FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis
by: Sanjeev, Santosh, et al.
Published: (2024)

ADAM-Dehaze: Adaptive Density-Aware Multi-Stage Dehazing for Improved Object Detection in Foggy Conditions
by: AlHindaassi, Fatmah, et al.
Published: (2025)

GLOFNet -- A Multimodal Dataset for GLOF Monitoring and Prediction
by: Fatima, Zuha, et al.
Published: (2025)

AstroSpy: On detecting Fake Images in Astronomy via Joint Image-Spectral Representations
by: Alam, Mohammed Talha, et al.
Published: (2024)

FaceAnonyMixer: Cancelable Faces via Identity Consistent Latent Space Mixing
by: Alam, Mohammed Talha, et al.
Published: (2025)

FLARE up your data: Diffusion-based Augmentation Method in Astronomical Imaging
by: Alam, Mohammed Talha, et al.
Published: (2024)

XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model
by: Hashmi, Anees Ur Rehman, et al.
Published: (2024)

CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging
by: Imam, Raza, et al.
Published: (2024)

Robust and Calibrated Detection of Authentic Multimedia Content
by: Hashmi, Sarim, et al.
Published: (2025)

A Context-aware Attention and Graph Neural Network-based Multimodal Framework for Misogyny Detection
by: Rehman, Mohammad Zia Ur, et al.
Published: (2025)

AI-Enhanced Virtual Biopsies for Brain Tumor Diagnosis in Low Resource Settings
by: Ehsan, Areeb
Published: (2025)

CLIP-Decoder : ZeroShot Multilabel Classification using Multimodal CLIP Aligned Representation
by: Ali, Muhammad, et al.
Published: (2024)

A streamlined Approach to Multimodal Few-Shot Class Incremental Learning for Fine-Grained Datasets
by: Doan, Thang, et al.
Published: (2024)

GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing
by: Shabbir, Akashah, et al.
Published: (2025)

AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming
by: Qazi, Ahmed, et al.
Published: (2024)

AdaptPrompt: Parameter-Efficient Adaptation of VLMs for Generalizable Deepfake Detection
by: Jiang, Yichen, et al.
Published: (2025)

A Culturally-diverse Multilingual Multimodal Video Benchmark & Model
by: Shafique, Bhuiyan Sanjid, et al.
Published: (2025)

FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation
by: Yagi, Takuma, et al.
Published: (2024)

HandVQA: Diagnosing and Improving Fine-Grained Spatial Reasoning about Hands in Vision-Language Models
by: Sayem, MD Khalequzzaman Chowdhury, et al.
Published: (2026)

T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation
by: Hosseini, Seyed Mohammad Hadi, et al.
Published: (2025)

GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting
by: Agarwal, Anushka, et al.
Published: (2025)

Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
by: Maaz, Muhammad, et al.
Published: (2025)

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages
by: Khan, Shaharukh, et al.
Published: (2026)

DuwatBench: Bridging Language and Visual Heritage through an Arabic Calligraphy Benchmark for Multimodal Understanding
by: Patle, Shubham, et al.
Published: (2026)

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation
by: Pan, Yulu, et al.
Published: (2025)

MCFNet: A Multimodal Collaborative Fusion Network for Fine-Grained Semantic Classification
by: Qiao, Yang, et al.
Published: (2025)

Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping
by: Zhang, Yujie, et al.
Published: (2025)

Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset
by: Li, Qi, et al.
Published: (2024)

Maya: An Instruction Finetuned Multilingual Multimodal Model
by: Alam, Nahid, et al.
Published: (2024)

FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension
by: Liu, Junzhuo, et al.
Published: (2024)

MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs
by: Du, Yipeng, et al.
Published: (2025)

Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method
by: Ye, Shuo, et al.
Published: (2023)

Jointly Learning Spatial, Angular, and Temporal Information for Enhanced Lane Detection
by: Alam, Muhammad Zeshan
Published: (2024)