| Main Authors: | Felfeliyan, Banafshe; Zhou, Yuyue; Ghosh, Shrimanti; Kupper, Jessica; Liu, Shaobo; Hareendranathan, Abhilash; Jaremko, Jacob L. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2401.06331 |
Similar Items
FlexICL: A Flexible Visual In-context Learning Framework for Elbow and Wrist Ultrasound Segmentation
by: Zhou, Yuyue, et al.
Published: (2025)
A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation
by: Zhou, Yuyue, et al.
Published: (2024)
Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts
by: Wahd, Assefa Seyoum, et al.
Published: (2024)
Robust Cross-Domain Generalization Using Unlabeled Target Data with Source-Domain Supervision
by: Zhou, Yuyue, et al.
Published: (2026)
Time-Contrastive Pretraining for In-Context Image and Video Segmentation
by: Wahd, Assefa, et al.
Published: (2025)
Reinforcement Learning for Ultrasound Image Analysis: A Comprehensive Review of Advances and Applications
by: Ezzelarab, Maha, et al.
Published: (2025)
Retuve: Automated Multi-Modality Analysis of Hip Dysplasia with Open Source AI
by: McArthur, Adam, et al.
Published: (2025)
$\left|\,\circlearrowright\,\boxed{\text{BUS}}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles
by: Das, Trishanu, et al.
Published: (2025)
Knee Osteoarthritis Severity Prediction using an Attentive Multi-Scale Deep Convolutional Neural Network
by: Jain, Rohit Kumar, et al.
Published: (2021)
Assessing and Learning Alignment of Unimodal Vision and Language Models
by: Zhang, Le, et al.
Published: (2024)
Diversity Matters: Revisiting Test-Time Compute in Vision-Language Models
by: Tong, Yijie, et al.
Published: (2026)
Bridging Hidden States in Vision-Language Models
by: Fein-Ashley, Benjamin, et al.
Published: (2025)
Real Time Emotion Analysis Using Deep Learning for Education, Entertainment, and Beyond
by: Khuntia, Abhilash, et al.
Published: (2024)
Disease-informed Adaptation of Vision-Language Models
by: Zhang, Jiajin, et al.
Published: (2024)
Do Vision Language Models Need to Process Image Tokens?
by: Ghosh, Sambit, et al.
Published: (2026)
VL-OrdinalFormer: Vision Language Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading
by: Ullah, Zahid, et al.
Published: (2025)
Bridging Visual Representation and Reinforcement Learning from Verifiable Rewards in Large Vision-Language Models
by: Han, Yuhang, et al.
Published: (2026)
H-SemiS: Hierarchical Fusion of Semi and Self-Supervised Learning for Knee Osteoarthritis Severity Grading
by: Raghaw, Chandravardhan Singh, et al.
Published: (2026)
Open World Scene Graph Generation using Vision Language Models
by: Dutta, Amartya, et al.
Published: (2025)
MarineEval: Assessing the Marine Intelligence of Vision-Language Models
by: Wong, YuK-Kwan, et al.
Published: (2025)
FETAL-GAUGE: A Benchmark for Assessing Vision-Language Models in Fetal Ultrasound
by: Alasmawi, Hussain, et al.
Published: (2025)
Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models
by: Grainge, Oliver, et al.
Published: (2025)
When Background Matters: Breaking Medical Vision Language Models by Transferable Attack
by: Ghosh, Akash, et al.
Published: (2026)
What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters
by: Liu, Shaobo, et al.
Published: (2026)
A Vision-Language Foundation Model for Leaf Disease Identification
by: Quoc, Khang Nguyen, et al.
Published: (2025)
Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences
by: Adeli, Vida, et al.
Published: (2024)
Advanced Smart City Monitoring: Real-Time Identification of Indian Citizen Attributes
by: Kale, Shubham, et al.
Published: (2024)
Assessing Privacy Preservation and Utility in Online Vision-Language Models
by: Chaudhari, Karmesh Siddharam, et al.
Published: (2026)
Shifting Focus: From Global Semantics to Local Prominent Features in Swin-Transformer for Knee Osteoarthritis Severity Assessment
by: Sekhri, Aymen, et al.
Published: (2024)
Transforming Precision: A Comparative Analysis of Vision Transformers, CNNs, and Traditional ML for Knee Osteoarthritis Severity Diagnosis
by: Apon, Tasnim Sakib, et al.
Published: (2024)
Rethinking Overlooked Aspects in Vision-Language Models
by: Liu, Yuan, et al.
Published: (2024)
Knowledge-Driven Vision-Language Model for Plexus Detection in Hirschsprung's Disease
by: Megahed, Youssef, et al.
Published: (2025)
Do Vision-Language Models Understand Compound Nouns?
by: Kumar, Sonal, et al.
Published: (2024)
Stacked Ensemble of Fine-Tuned CNNs for Knee Osteoarthritis Severity Grading
by: Gupta, Adarsh, et al.
Published: (2025)
Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models
by: Sun, Jingchen, et al.
Published: (2026)
POINTS1.5: Building a Vision-Language Model towards Real World Applications
by: Liu, Yuan, et al.
Published: (2024)
Visual Cues of Gender and Race are Associated with Stereotyping in Vision-Language Models
by: Lee, Messi H. J., et al.
Published: (2025)
ChronusOmni: Improving Time Awareness of Omni Large Language Models
by: Chen, Yijing, et al.
Published: (2025)
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models
by: Nandy, Abhilash, et al.
Published: (2024)
Ocean-OCR: Towards General OCR Application via a Vision-Language Model
by: Chen, Song, et al.
Published: (2025)