| Main Authors: | Lam, Hoang Khanh, Perera, Kahandakanaththage Maduni Pramuditha |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.02448 |
Similar Items
Linear Spaces of Meanings: Compositional Structures in Vision-Language Models
by: Trager, Matthew, et al.
Published: (2023)
Symmetric masking strategy enhances the performance of Masked Image Modeling
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection
by: Karami, Ali, et al.
Published: (2024)
DARB-Splatting: Generalizing Splatting with Decaying Anisotropic Radial Basis Functions
by: Pramuditha, Hashiru, et al.
Published: (2025)
Leveraging Hierarchical Image-Text Misalignment for Universal Fake Image Detection
by: Zhang, Daichi, et al.
Published: (2025)
Discriminative-Generative Custom Tokens for Vision-Language Models
by: Perera, Pramuditha, et al.
Published: (2025)
Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation
by: Perera, Shehan, et al.
Published: (2024)
From Specialist to Generalist: Unlocking SAM's Learning Potential on Unlabeled Medical Images
by: Vu, Vi, et al.
Published: (2026)
PAS : Prelim Attention Score for Detecting Object Hallucinations in Large Vision--Language Models
by: Hoang-Xuan, Nhat, et al.
Published: (2025)
Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models
by: Nguyen, Ky Dan, et al.
Published: (2025)
Hierarchical Deep Fusion Framework for Multi-dimensional Facial Forgery Detection -- The 2024 Global Deepfake Image Detection Challenge
by: Wang, Kohou, et al.
Published: (2025)
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization
by: Nguyen, Hong, et al.
Published: (2024)
Empowering Morphing Attack Detection using Interpretable Image-Text Foundation Model
by: Patwardhan, Sushrut, et al.
Published: (2025)
Brain Stroke Detection and Classification Using CT Imaging with Transformer Models and Explainable AI
by: Qari, Shomukh, et al.
Published: (2025)
Digital Image Forgery Detection Using Transfer Learning
by: Buyuk, Fatma Betul, et al.
Published: (2026)
Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models
by: Mohamud, Safaa Abdullahi Moallim, et al.
Published: (2025)
HDC: Hierarchical Distillation for Multi-level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation
by: Le, Tran Quoc Khanh, et al.
Published: (2025)
Cross-modality debiasing: using language to mitigate sub-population shifts in imaging
by: Pang, Yijiang, et al.
Published: (2024)
SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection
by: Xu, Wenhao, et al.
Published: (2025)
HELM: Hierarchical and Explicit Label Modeling with Graph Learning for Multi-Label Image Classification
by: Stoimchev, Marjan, et al.
Published: (2026)
SAVE: Segment Audio-Visual Easy way using Segment Anything Model
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
Feature-Enhanced TResNet for Fine-Grained Food Image Classification
by: Liu, Lulu, et al.
Published: (2025)
COLI: A Hierarchical Efficient Compressor for Large Images
by: Wang, Haoran, et al.
Published: (2025)
Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition
by: Romero-Tapiador, Sergio, et al.
Published: (2025)
Aquila: A Hierarchically Aligned Visual-Language Model for Enhanced Remote Sensing Image Comprehension
by: Lu, Kaixuan, et al.
Published: (2024)
Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer
by: Nguyen, Quoc Khanh, et al.
Published: (2024)
PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability
by: Vu, Tung, et al.
Published: (2025)
Vision-Based Approach for Food Weight Estimation from 2D Images
by: Wimalasiri, Chathura, et al.
Published: (2024)
Swin-TUNA : A Novel PEFT Approach for Accurate Food Image Segmentation
by: Chen, Haotian, et al.
Published: (2025)
Empirical Analysis of Anomaly Detection on Hyperspectral Imaging Using Dimension Reduction Methods
by: Kim, Dongeon, et al.
Published: (2024)
Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection
by: Jung, Min Jae, et al.
Published: (2023)
Detection of Autonomous Shuttles in Urban Traffic Images Using Adaptive Residual Context
by: Younes, Mohamed Aziz, et al.
Published: (2026)
Automatic Pith Detection in Tree Cross-Section Images Using Deep Learning
by: Liao, Tzu-I, et al.
Published: (2025)
Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
by: Banks, Ryan, et al.
Published: (2025)
Hierarchical Vision-Language Learning for Medical Out-of-Distribution Detection
by: Lai, Runhe, et al.
Published: (2025)
Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation
by: Xu, Yijia, et al.
Published: (2026)
RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation
by: Chen, Yucheng, et al.
Published: (2026)
Evaluating Multimodal Generative AI with Korean Educational Standards
by: Park, Sanghee, et al.
Published: (2025)
Deep Learning-Based Computer Vision Models for Early Cancer Detection Using Multimodal Medical Imaging and Radiogenomic Integration Frameworks
by: Oghenekaro, Emmanuella Avwerosuoghene
Published: (2025)