:: Library Catalog

Bibliographic Details
Main Authors:	Mohammed, Rawa; Attin, Mina; Shareef, Bryar
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2511.20956

Similar Items

NULLBUS: Multimodal Mixed-Supervision for Breast Ultrasound Segmentation via Nullable Global-Local Prompts
by: Mallina, Raja, et al.
Published: (2025)

XBusNet: Text-Guided Breast Ultrasound Segmentation via Multimodal Vision-Language Learning
by: Mallina, Raja, et al.
Published: (2025)

DiA-gnostic VLVAE: Disentangled Alignment-Constrained Vision Language Variational AutoEncoder for Robust Radiology Reporting with Missing Modalities
by: Shaik, Nagur Shareef, et al.
Published: (2025)

Advancing Offline Handwritten Text Recognition: A Systematic Review of Data Augmentation and Generation Techniques
by: Rassul, Yassin Hussein, et al.
Published: (2025)

PETAR: Localized Findings Generation with Mask-Aware Vision-Language Modeling for PET Automated Reporting
by: Maqbool, Danyal, et al.
Published: (2025)

TriAug: Out-of-Distribution Detection for Imbalanced Breast Lesion in Ultrasound
by: Ye, Yinyu, et al.
Published: (2024)

Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound
by: Huang, Yuhao, et al.
Published: (2025)

How Culturally Aware are Vision-Language Models?
by: Burda-Lassen, Olena, et al.
Published: (2024)

Prompt-Based Safety Guidance Is Ineffective for Unlearned Text-to-Image Diffusion Models
by: Shin, Jiwoo, et al.
Published: (2025)

A Foundational Generative Model for Breast Ultrasound Image Analysis
by: Yu, Haojun, et al.
Published: (2025)

Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models
by: Kim, Jinyeong, et al.
Published: (2025)

Mitigating Object Hallucinations in Vision-Language Models through Region-Aware Attention Recalibration
by: Xu, Yuanzhi, et al.
Published: (2026)

Evaluating Hallucination in Large Vision-Language Models based on Context-Aware Object Similarities
by: Datta, Shounak, et al.
Published: (2025)

Prompt-Free SAM-Based Multi-Task Framework for Breast Ultrasound Lesion Segmentation and Classification
by: Johnny, Samuel E., et al.
Published: (2026)

Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)

Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
by: Dafnis, Konstantinos M., et al.
Published: (2025)

Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback
by: Zhou, Yucheng, et al.
Published: (2025)

Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
by: Deng, Ailin, et al.
Published: (2025)

Anomaly-Aware Vision-Language Adapters for Zero-Shot Anomaly Detection
by: Aqeel, Muhammad, et al.
Published: (2026)

H2OVL-Mississippi Vision Language Models Technical Report
by: Galib, Shaikat, et al.
Published: (2024)

Constraint-Aware Neurosymbolic Uncertainty Quantification with Bayesian Deep Learning for Scientific Discovery
by: Alam, Shahnawaz, et al.
Published: (2026)

ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models
by: Clavié, Benjamin, et al.
Published: (2025)

FastVLM: Efficient Vision Encoding for Vision Language Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)

RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment
by: Gu, Difei, et al.
Published: (2025)

HiPath: Hierarchical Vision-Language Alignment for Structured Pathology Report Prediction
by: Yuan, Ruicheng, et al.
Published: (2026)

Collision-Aware Vision-Language Learning for End-to-End Driving with Multimodal Infraction Datasets
by: Koran, Alex, et al.
Published: (2026)

Text-Aware Diffusion for Policy Learning
by: Luo, Calvin, et al.
Published: (2024)

Adapting Vision-Language Models for Evaluating World Models
by: Hendriksen, Mariya, et al.
Published: (2025)

Multi-Modal Adapter for Vision-Language Models
by: Seputis, Dominykas, et al.
Published: (2024)

Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models
by: Lian, Chenyu, et al.
Published: (2025)

VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
by: Yu, Xinlei, et al.
Published: (2025)

Situational Awareness Matters in 3D Vision Language Reasoning
by: Man, Yunze, et al.
Published: (2024)

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
by: Zhu, Kangyu, et al.
Published: (2024)

T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection
by: Varaganti, Manikanta, et al.
Published: (2025)

Generalized Category Discovery under Domain Shifts: From Vision to Vision-Language Models
by: Wang, Hongjun, et al.
Published: (2026)

Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis
by: Verma, Prateek, et al.
Published: (2024)

Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models
by: Chua, Jia Yun, et al.
Published: (2025)

CrossVL: Complexity-Aware Feature Routing and Paired Curriculum for Cross-View Vision-Language Detection
by: Liu, Zhipeng, et al.
Published: (2026)

Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI
by: Islam, Taminul, et al.
Published: (2024)

CFM: Language-aligned Concept Foundation Model for Vision
by: Wittenmayer, Kai, et al.
Published: (2026)