| Main Authors: | Mohammed, Rawa; Attin, Mina; Shareef, Bryar |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.20956 |
Similar Items
NULLBUS: Multimodal Mixed-Supervision for Breast Ultrasound Segmentation via Nullable Global-Local Prompts
by: Mallina, Raja, et al.
Published: (2025)
XBusNet: Text-Guided Breast Ultrasound Segmentation via Multimodal Vision-Language Learning
by: Mallina, Raja, et al.
Published: (2025)
DiA-gnostic VLVAE: Disentangled Alignment-Constrained Vision Language Variational AutoEncoder for Robust Radiology Reporting with Missing Modalities
by: Shaik, Nagur Shareef, et al.
Published: (2025)
Advancing Offline Handwritten Text Recognition: A Systematic Review of Data Augmentation and Generation Techniques
by: Rassul, Yassin Hussein, et al.
Published: (2025)
PETAR: Localized Findings Generation with Mask-Aware Vision-Language Modeling for PET Automated Reporting
by: Maqbool, Danyal, et al.
Published: (2025)
TriAug: Out-of-Distribution Detection for Imbalanced Breast Lesion in Ultrasound
by: Ye, Yinyu, et al.
Published: (2024)
Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound
by: Huang, Yuhao, et al.
Published: (2025)
How Culturally Aware are Vision-Language Models?
by: Burda-Lassen, Olena, et al.
Published: (2024)
Prompt-Based Safety Guidance Is Ineffective for Unlearned Text-to-Image Diffusion Models
by: Shin, Jiwoo, et al.
Published: (2025)
A Foundational Generative Model for Breast Ultrasound Image Analysis
by: Yu, Haojun, et al.
Published: (2025)
Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models
by: Kim, Jinyeong, et al.
Published: (2025)
Mitigating Object Hallucinations in Vision-Language Models through Region-Aware Attention Recalibration
by: Xu, Yuanzhi, et al.
Published: (2026)
Evaluating Hallucination in Large Vision-Language Models based on Context-Aware Object Similarities
by: Datta, Shounak, et al.
Published: (2025)
Prompt-Free SAM-Based Multi-Task Framework for Breast Ultrasound Lesion Segmentation and Classification
by: Johnny, Samuel E., et al.
Published: (2026)
Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)
Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
by: Dafnis, Konstantinos M., et al.
Published: (2025)
Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback
by: Zhou, Yucheng, et al.
Published: (2025)
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
by: Deng, Ailin, et al.
Published: (2025)
Anomaly-Aware Vision-Language Adapters for Zero-Shot Anomaly Detection
by: Aqeel, Muhammad, et al.
Published: (2026)
H2OVL-Mississippi Vision Language Models Technical Report
by: Galib, Shaikat, et al.
Published: (2024)
Constraint-Aware Neurosymbolic Uncertainty Quantification with Bayesian Deep Learning for Scientific Discovery
by: Alam, Shahnawaz, et al.
Published: (2026)
ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models
by: Clavié, Benjamin, et al.
Published: (2025)
FastVLM: Efficient Vision Encoding for Vision Language Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)
RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment
by: Gu, Difei, et al.
Published: (2025)
HiPath: Hierarchical Vision-Language Alignment for Structured Pathology Report Prediction
by: Yuan, Ruicheng, et al.
Published: (2026)
Collision-Aware Vision-Language Learning for End-to-End Driving with Multimodal Infraction Datasets
by: Koran, Alex, et al.
Published: (2026)
Text-Aware Diffusion for Policy Learning
by: Luo, Calvin, et al.
Published: (2024)
Adapting Vision-Language Models for Evaluating World Models
by: Hendriksen, Mariya, et al.
Published: (2025)
Multi-Modal Adapter for Vision-Language Models
by: Seputis, Dominykas, et al.
Published: (2024)
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models
by: Lian, Chenyu, et al.
Published: (2025)
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
by: Yu, Xinlei, et al.
Published: (2025)
Situational Awareness Matters in 3D Vision Language Reasoning
by: Man, Yunze, et al.
Published: (2024)
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
by: Zhu, Kangyu, et al.
Published: (2024)
T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection
by: Varaganti, Manikanta, et al.
Published: (2025)
Generalized Category Discovery under Domain Shifts: From Vision to Vision-Language Models
by: Wang, Hongjun, et al.
Published: (2026)
Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis
by: Verma, Prateek, et al.
Published: (2024)
Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models
by: Chua, Jia Yun, et al.
Published: (2025)
CrossVL: Complexity-Aware Feature Routing and Paired Curriculum for Cross-View Vision-Language Detection
by: Liu, Zhipeng, et al.
Published: (2026)
Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI
by: Islam, Taminul, et al.
Published: (2024)
CFM: Language-aligned Concept Foundation Model for Vision
by: Wittenmayer, Kai, et al.
Published: (2026)