:: Library Catalog

Bibliographic Details
Main Authors:	Lam, Hoang Khanh; Perera, Kahandakanaththage Maduni Pramuditha
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.02448

Similar Items

Linear Spaces of Meanings: Compositional Structures in Vision-Language Models
by: Trager, Matthew, et al.
Published: (2023)

Symmetric masking strategy enhances the performance of Masked Image Modeling
by: Nguyen, Khanh-Binh, et al.
Published: (2024)

Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection
by: Karami, Ali, et al.
Published: (2024)

DARB-Splatting: Generalizing Splatting with Decaying Anisotropic Radial Basis Functions
by: Pramuditha, Hashiru, et al.
Published: (2025)

Leveraging Hierarchical Image-Text Misalignment for Universal Fake Image Detection
by: Zhang, Daichi, et al.
Published: (2025)

Discriminative-Generative Custom Tokens for Vision-Language Models
by: Perera, Pramuditha, et al.
Published: (2025)

Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
by: Nguyen, Khanh-Binh, et al.
Published: (2024)

MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation
by: Perera, Shehan, et al.
Published: (2024)

From Specialist to Generalist: Unlocking SAM's Learning Potential on Unlabeled Medical Images
by: Vu, Vi, et al.
Published: (2026)

PAS: Prelim Attention Score for Detecting Object Hallucinations in Large Vision-Language Models
by: Hoang-Xuan, Nhat, et al.
Published: (2025)

Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models
by: Nguyen, Ky Dan, et al.
Published: (2025)

Hierarchical Deep Fusion Framework for Multi-dimensional Facial Forgery Detection -- The 2024 Global Deepfake Image Detection Challenge
by: Wang, Kohou, et al.
Published: (2025)

ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization
by: Nguyen, Hong, et al.
Published: (2024)

Empowering Morphing Attack Detection using Interpretable Image-Text Foundation Model
by: Patwardhan, Sushrut, et al.
Published: (2025)

Brain Stroke Detection and Classification Using CT Imaging with Transformer Models and Explainable AI
by: Qari, Shomukh, et al.
Published: (2025)

Digital Image Forgery Detection Using Transfer Learning
by: Buyuk, Fatma Betul, et al.
Published: (2026)

Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models
by: Mohamud, Safaa Abdullahi Moallim, et al.
Published: (2025)

HDC: Hierarchical Distillation for Multi-level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation
by: Le, Tran Quoc Khanh, et al.
Published: (2025)

Cross-modality debiasing: using language to mitigate sub-population shifts in imaging
by: Pang, Yijiang, et al.
Published: (2024)

SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection
by: Xu, Wenhao, et al.
Published: (2025)

HELM: Hierarchical and Explicit Label Modeling with Graph Learning for Multi-Label Image Classification
by: Stoimchev, Marjan, et al.
Published: (2026)

SAVE: Segment Audio-Visual Easy way using Segment Anything Model
by: Nguyen, Khanh-Binh, et al.
Published: (2024)

Feature-Enhanced TResNet for Fine-Grained Food Image Classification
by: Liu, Lulu, et al.
Published: (2025)

COLI: A Hierarchical Efficient Compressor for Large Images
by: Wang, Haoran, et al.
Published: (2025)

Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition
by: Romero-Tapiador, Sergio, et al.
Published: (2025)

Aquila: A Hierarchically Aligned Visual-Language Model for Enhanced Remote Sensing Image Comprehension
by: Lu, Kaixuan, et al.
Published: (2024)

Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer
by: Nguyen, Quoc Khanh, et al.
Published: (2024)

PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability
by: Vu, Tung, et al.
Published: (2025)

Vision-Based Approach for Food Weight Estimation from 2D Images
by: Wimalasiri, Chathura, et al.
Published: (2024)

Swin-TUNA: A Novel PEFT Approach for Accurate Food Image Segmentation
by: Chen, Haotian, et al.
Published: (2025)

Empirical Analysis of Anomaly Detection on Hyperspectral Imaging Using Dimension Reduction Methods
by: Kim, Dongeon, et al.
Published: (2024)

Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection
by: Jung, Min Jae, et al.
Published: (2023)

Detection of Autonomous Shuttles in Urban Traffic Images Using Adaptive Residual Context
by: Younes, Mohamed Aziz, et al.
Published: (2026)

Automatic Pith Detection in Tree Cross-Section Images Using Deep Learning
by: Liao, Tzu-I, et al.
Published: (2025)

Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
by: Banks, Ryan, et al.
Published: (2025)

Hierarchical Vision-Language Learning for Medical Out-of-Distribution Detection
by: Lai, Runhe, et al.
Published: (2025)

Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation
by: Xu, Yijia, et al.
Published: (2026)

RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation
by: Chen, Yucheng, et al.
Published: (2026)

Evaluating Multimodal Generative AI with Korean Educational Standards
by: Park, Sanghee, et al.
Published: (2025)

Deep Learning-Based Computer Vision Models for Early Cancer Detection Using Multimodal Medical Imaging and Radiogenomic Integration Frameworks
by: Oghenekaro, Emmanuella Avwerosuoghene
Published: (2025)