| Main Authors: | Anonto, Riad Ahmed, Zabin, Sardar Md. Saffat, Rahman, M. Saifur |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.18369 |
Similar Items
DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection
by: Shanto, MD Sadik Hossain, et al.
Published: (2025)
Two Decades of Bengali Handwritten Digit Recognition: A Survey
by: Rahman, A. B. M. Ashikur, et al.
Published: (2022)
BeHGAN: Bengali Handwritten Word Generation from Plain Text Using Generative Adversarial Networks
by: Islam, Md. Rakibul, et al.
Published: (2025)
Pay Attention to Where You Looked
by: Berian, Alex, et al.
Published: (2026)
Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention
by: Zhao, Jianfei, et al.
Published: (2025)
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
by: Tu, Yunbin, et al.
Published: (2024)
BdSL-SPOTER: A Transformer-Based Framework for Bengali Sign Language Recognition with Cultural Adaptation
by: Azad, Sayad Ibna, et al.
Published: (2025)
Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search
by: Tan, Lei, et al.
Published: (2024)
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
by: Ye, Qinghao, et al.
Published: (2025)
Guided Attention for Interpretable Motion Captioning
by: Radouane, Karim, et al.
Published: (2023)
Leveraging Complementary Attention maps in vision transformers for OCT image analysis
by: Shahgir, Haz Sameen, et al.
Published: (2023)
PULSAR: Graph based Positive Unlabeled Learning with Multi Stream Adaptive Convolutions for Parkinson's Disease Recognition
by: Alam, Md. Zarif Ul, et al.
Published: (2023)
An Efficient Dual-Line Decoder Network with Multi-Scale Convolutional Attention for Multi-organ Segmentation
by: Hassan, Riad, et al.
Published: (2025)
AGIC: Attention-Guided Image Captioning to Improve Caption Relevance
by: Teja, L. D. M. S. Sai, et al.
Published: (2025)
Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models
by: Samin, Niamul Hassan, et al.
Published: (2026)
SSTAF: Spatial-Spectral-Temporal Attention Fusion Transformer for Motor Imagery Classification
by: Muna, Ummay Maria, et al.
Published: (2025)
A light-weight model to generate NDWI from Sentinel-1
by: Ahmed, Saleh Sakib, et al.
Published: (2025)
Beyond Where to Look: Trajectory-Guided Reinforcement Learning for Multimodal RLVR
by: Lu, Jinda, et al.
Published: (2026)
Bengali Sign Language Recognition through Hand Pose Estimation using Multi-Branch Spatial-Temporal Attention Model
by: Miah, Abu Saleh Musa, et al.
Published: (2024)
When Safety Blocks Sense: Measuring Semantic Confusion in LLM Refusals
by: Anonto, Riad Ahmed, et al.
Published: (2025)
Impact of Tuning Parameters in Deep Convolutional Neural Network Using a Crack Image Dataset
by: Zabin, Mahe, et al.
Published: (2025)
Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types
by: Rabby, AKM Shahariar Azad, et al.
Published: (2024)
Compressed Image Captioning using CNN-based Encoder-Decoder Framework
by: Ridoy, Md Alif Rahman, et al.
Published: (2024)
LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision
by: Fuller, Anthony, et al.
Published: (2025)
GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer
by: Hasan, Md. Mahmudul, et al.
Published: (2025)
Representation Alignment Contrastive Regularization for Multi-Object Tracking
by: Liu, Zhonglin, et al.
Published: (2024)
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
by: Bianchi, Lorenzo, et al.
Published: (2025)
Adaptive Enhancement and Dual-Pooling Sequential Attention for Lightweight Underwater Object Detection with YOLOv10
by: Rahman, Md. Mushibur, et al.
Published: (2026)
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
by: Song, Zijie, et al.
Published: (2023)
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
by: Ma, Ziping, et al.
Published: (2024)
Enhancing Cross-Patient Generalization in AI-Based Parkinson's Disease Detection
by: Albani, Mhd Adnan, et al.
Published: (2025)
Inserting Faces inside Captions: Image Captioning with Attention Guided Merging
by: Tevissen, Yannis, et al.
Published: (2024)
CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping
by: Rana, Md Shohel, et al.
Published: (2026)
PatchAlign3D: Local Feature Alignment for Dense 3D Shape understanding
by: Hadgi, Souhail, et al.
Published: (2026)
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
by: Chowdhury, Md Abtahi Majeed, et al.
Published: (2025)
Learning to Look: Cognitive Attention Alignment with Vision-Language Models
by: Yang, Ryan L., et al.
Published: (2025)
The Art of Saying "Maybe": A Conformal Lens for Uncertainty Benchmarking in VLMs
by: Azad, Asif, et al.
Published: (2025)
MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation
by: Yang, Zhiwei, et al.
Published: (2024)
Cross Modification Attention Based Deliberation Model for Image Captioning
by: Lian, Zheng, et al.
Published: (2021)
Dynamic Cross-Modal Alignment for Robust Semantic Location Prediction
by: Jing, Liu, et al.
Published: (2024)