:: Library Catalog

Bibliographic Details
Main Authors:	Kazi, Nur Mohammad; Khaled, Ibteshum; Galib, Md. Luthful Hasan; Shihab, Ali Faruk; Islam, Md. Rakibul
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.17439

Similar Items

Explainable AI-Driven Detection of Human Monkeypox Using Deep Learning and Vision Transformers: A Comprehensive Analysis
by: Hossain, Md. Zahid, et al.
Published: (2025)

Real-Time Multi-Modal Embedded Vision Framework for Object Detection Facial Emotion Recognition and Biometric Identification on Low-Power Edge Platforms
by: Zahid, S. M. Khalid Bin, et al.
Published: (2026)

A Two-Stage Multitask Vision-Language Framework for Explainable Crop Disease Visual Question Answering
by: Hossain, Md. Zahid, et al.
Published: (2026)

Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2
by: Islam, Md. Rakibul, et al.
Published: (2025)

CAST: Channel-Aware Spatial Transfer Learning with Pseudo-Image Radar for Sign Language Recognition
by: Shujon, Md. Shakhoyat Rahman, et al.
Published: (2026)

Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset
by: Himel, Galib Muhammad Shahriar, et al.
Published: (2024)

Less Is More? Selective Visual Attention to High-Importance Regions for Multimodal Radiology Summarization
by: Naznin, Mst. Fahmida Sultana, et al.
Published: (2026)

OncoVision: Integrating Mammography and Clinical Data through Attention-Driven Multimodal AI for Enhanced Breast Cancer Diagnosis
by: Ahmed, Istiak, et al.
Published: (2025)

MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder
by: Liu, Xuehan, et al.
Published: (2024)

Exploring the Efficacy of Modified Transfer Learning in Identifying Parkinson's Disease Through Drawn Image Patterns
by: Daiyan, Nabil, et al.
Published: (2025)

FUSE: Unifying Spectral and Semantic Cues for Robust AI-Generated Image Detection
by: Hossain, Md. Zahid, et al.
Published: (2025)

Exploring Synergistic Ensemble Learning: Uniting CNNs, MLP-Mixers, and Vision Transformers to Enhance Image Classification
by: Bashar, Mk, et al.
Published: (2025)

An Explainable Vision-Language Model Framework with Adaptive PID-Tversky Loss for Lumbar Spinal Stenosis Diagnosis
by: Sk., Md. Sajeebul Islam, et al.
Published: (2026)

HeBA: Heterogeneous Bottleneck Adapters for Robust Vision-Language Models
by: Islam, Md Jahidul
Published: (2026)

A Domain-Adapted Lightweight Ensemble for Resource-Efficient Few-Shot Plant Disease Classification
by: Islam, Anika, et al.
Published: (2025)

Privacy-Preserving Empathy Detection in Video Interactions
by: Hasan, Md Rakibul, et al.
Published: (2025)

Object Detection and Tracking
by: Pranto, Md, et al.
Published: (2025)

Bengali Sign Language Recognition through Hand Pose Estimation using Multi-Branch Spatial-Temporal Attention Model
by: Miah, Abu Saleh Musa, et al.
Published: (2024)

Fine-Tuned CNN-Based Approach for Multi-Class Mango Leaf Disease Detection
by: Ahmmed, Jalal, et al.
Published: (2025)

Mam-App: A Novel Parameter-Efficient Mamba Model for Apple Leaf Disease Classification
by: Mahamood, Md Nadim, et al.
Published: (2026)

Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-based Non-invasive Digital System
by: Himel, Galib Muhammad Shahriar, et al.
Published: (2024)

Vision-Based Lane Following and Traffic Sign Recognition for Resource-Constrained Autonomous Vehicles
by: Islam, Md Tanjemul, et al.
Published: (2026)

Size and Smoothness Aware Adaptive Focal Loss for Small Tumor Segmentation
by: Islam, Md Rakibul, et al.
Published: (2024)

Pavlok-Nudge: A Feedback Mechanism for Atomic Behaviour Modification with Snoring Usecase
by: Hasan, Md Rakibul, et al.
Published: (2023)

Privacy-Preserving Chest X-ray Report Generation via Multimodal Federated Learning with ViT and GPT-2
by: Hossain, Md. Zahid, et al.
Published: (2025)

GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection
by: Mia, Md Sohag, et al.
Published: (2025)

ReHARK: Refined Hybrid Adaptive RBF Kernels for Robust One-Shot Vision-Language Adaptation
by: Islam, Md Jahidul
Published: (2026)

BeHGAN: Bengali Handwritten Word Generation from Plain Text Using Generative Adversarial Networks
by: Islam, Md. Rakibul, et al.
Published: (2025)

Design and Development of a Low-Cost Scalable GSM-IoT Smart Pet Feeder with a Remote Mobile Application
by: Nishat, Md. Rakibul Hasan, et al.
Published: (2026)

SloMo-Fast: Slow-Momentum and Fast-Adaptive Teachers for Source-Free Continual Test-Time Adaptation
by: Iftee, Md Akil Raihan, et al.
Published: (2025)

Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation
by: Rahman, Kazi Mahathir, et al.
Published: (2025)

ElderFallGuard: Real-Time IoT and Computer Vision-Based Fall Detection System for Elderly Safety
by: Riahi, Tasrifur, et al.
Published: (2025)

EVCC: Enhanced Vision Transformer-ConvNeXt-CoAtNet Fusion for Classification
by: Hasan, Kazi Reyazul, et al.
Published: (2025)

DANet: Enhancing Small Object Detection through an Efficient Deformable Attention Network
by: Mia, Md Sohag, et al.
Published: (2023)

A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision
by: Rohan, Ali, et al.
Published: (2025)

GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer
by: Hasan, Md. Mahmudul, et al.
Published: (2025)

PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models
by: Prottasha, Nusrat Jahan, et al.
Published: (2025)

pFedBBN: A Personalized Federated Test-Time Adaptation with Balanced Batch Normalization for Class-Imbalanced Data
by: Iftee, Md Akil Raihan, et al.
Published: (2025)

HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
by: Arif, Kazi Hasan Ibn, et al.
Published: (2024)

IKIWISI: An Interactive Visual Pattern Generator for Evaluating the Reliability of Vision-Language Models Without Ground Truth
by: Islam, Md Touhidul, et al.
Published: (2025)