Library Catalog

Bibliographic Details
Main Authors:	Felfeliyan, Banafshe; Zhou, Yuyue; Ghosh, Shrimanti; Kupper, Jessica; Liu, Shaobo; Hareendranathan, Abhilash; Jaremko, Jacob L.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2401.06331

Similar Items

FlexICL: A Flexible Visual In-context Learning Framework for Elbow and Wrist Ultrasound Segmentation
by: Zhou, Yuyue, et al.
Published: (2025)

A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation
by: Zhou, Yuyue, et al.
Published: (2024)

Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts
by: Wahd, Assefa Seyoum, et al.
Published: (2024)

Robust Cross-Domain Generalization Using Unlabeled Target Data with Source-Domain Supervision
by: Zhou, Yuyue, et al.
Published: (2026)

Time-Contrastive Pretraining for In-Context Image and Video Segmentation
by: Wahd, Assefa, et al.
Published: (2025)

Reinforcement Learning for Ultrasound Image Analysis: A Comprehensive Review of Advances and Applications
by: Ezzelarab, Maha, et al.
Published: (2025)

Retuve: Automated Multi-Modality Analysis of Hip Dysplasia with Open Source AI
by: McArthur, Adam, et al.
Published: (2025)

$\left|\,\circlearrowright\,\boxed{\text{BUS}}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles
by: Das, Trishanu, et al.
Published: (2025)

Knee Osteoarthritis Severity Prediction using an Attentive Multi-Scale Deep Convolutional Neural Network
by: Jain, Rohit Kumar, et al.
Published: (2021)

Assessing and Learning Alignment of Unimodal Vision and Language Models
by: Zhang, Le, et al.
Published: (2024)

Diversity Matters: Revisiting Test-Time Compute in Vision-Language Models
by: Tong, Yijie, et al.
Published: (2026)

Bridging Hidden States in Vision-Language Models
by: Fein-Ashley, Benjamin, et al.
Published: (2025)

Real Time Emotion Analysis Using Deep Learning for Education, Entertainment, and Beyond
by: Khuntia, Abhilash, et al.
Published: (2024)

Disease-informed Adaptation of Vision-Language Models
by: Zhang, Jiajin, et al.
Published: (2024)

Do Vision Language Models Need to Process Image Tokens?
by: Ghosh, Sambit, et al.
Published: (2026)

VL-OrdinalFormer: Vision Language Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading
by: Ullah, Zahid, et al.
Published: (2025)

Bridging Visual Representation and Reinforcement Learning from Verifiable Rewards in Large Vision-Language Models
by: Han, Yuhang, et al.
Published: (2026)

H-SemiS: Hierarchical Fusion of Semi and Self-Supervised Learning for Knee Osteoarthritis Severity Grading
by: Raghaw, Chandravardhan Singh, et al.
Published: (2026)

Open World Scene Graph Generation using Vision Language Models
by: Dutta, Amartya, et al.
Published: (2025)

MarineEval: Assessing the Marine Intelligence of Vision-Language Models
by: Wong, YuK-Kwan, et al.
Published: (2025)

FETAL-GAUGE: A Benchmark for Assessing Vision-Language Models in Fetal Ultrasound
by: Alasmawi, Hussain, et al.
Published: (2025)

Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models
by: Grainge, Oliver, et al.
Published: (2025)

When Background Matters: Breaking Medical Vision Language Models by Transferable Attack
by: Ghosh, Akash, et al.
Published: (2026)

What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters
by: Liu, Shaobo, et al.
Published: (2026)

A Vision-Language Foundation Model for Leaf Disease Identification
by: Quoc, Khang Nguyen, et al.
Published: (2025)

Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences
by: Adeli, Vida, et al.
Published: (2024)

Advanced Smart City Monitoring: Real-Time Identification of Indian Citizen Attributes
by: Kale, Shubham, et al.
Published: (2024)

Assessing Privacy Preservation and Utility in Online Vision-Language Models
by: Chaudhari, Karmesh Siddharam, et al.
Published: (2026)

Shifting Focus: From Global Semantics to Local Prominent Features in Swin-Transformer for Knee Osteoarthritis Severity Assessment
by: Sekhri, Aymen, et al.
Published: (2024)

Transforming Precision: A Comparative Analysis of Vision Transformers, CNNs, and Traditional ML for Knee Osteoarthritis Severity Diagnosis
by: Apon, Tasnim Sakib, et al.
Published: (2024)

Rethinking Overlooked Aspects in Vision-Language Models
by: Liu, Yuan, et al.
Published: (2024)

Knowledge-Driven Vision-Language Model for Plexus Detection in Hirschsprung's Disease
by: Megahed, Youssef, et al.
Published: (2025)

Do Vision-Language Models Understand Compound Nouns?
by: Kumar, Sonal, et al.
Published: (2024)

Stacked Ensemble of Fine-Tuned CNNs for Knee Osteoarthritis Severity Grading
by: Gupta, Adarsh, et al.
Published: (2025)

Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models
by: Sun, Jingchen, et al.
Published: (2026)

POINTS1.5: Building a Vision-Language Model towards Real World Applications
by: Liu, Yuan, et al.
Published: (2024)

Visual Cues of Gender and Race are Associated with Stereotyping in Vision-Language Models
by: Lee, Messi H. J., et al.
Published: (2025)

ChronusOmni: Improving Time Awareness of Omni Large Language Models
by: Chen, Yijing, et al.
Published: (2025)

YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models
by: Nandy, Abhilash, et al.
Published: (2024)

Ocean-OCR: Towards General OCR Application via a Vision-Language Model
by: Chen, Song, et al.
Published: (2025)