:: Library Catalog

Bibliographic Details
Main Authors:	Cheng, Jiajun; Zhao, Xianwu; Liu, Sainan; Yu, Xiaofan; Prakash, Ravi; Codd, Patrick J.; Katz, Jonathan Elliott; Lin, Shan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.10764

Similar Items

TrajPred: Trajectory-Conditioned Joint Embedding Prediction for Surgical Instrument-Tissue Interaction Recognition in Vision-Language Models
by: Cheng, Jiajun, et al.
Published: (2026)

PalpAid: Multimodal Pneumatic Tactile Sensor for Tissue Palpation
by: Yuliarti, Devi, et al.
Published: (2025)

Sampling-Based Model Predictive Control for Volumetric Ablation in Robotic Laser Surgery
by: Wang, Vincent Y., et al.
Published: (2024)

XBench: A Comprehensive Benchmark for Visual-Language Explanations in Chest Radiography
by: Luo, Haozhe, et al.
Published: (2025)

LitXBench: A Benchmark for Extracting Experiments from Scientific Literature
by: Chong, Curtis, et al.
Published: (2026)

See, Plan, Cut: MPC-Based Autonomous Volumetric Robotic Laser Surgery with OCT Guidance
by: Prakash, Ravi, et al.
Published: (2025)

Where is the Boundary: Multimodal Sensor Fusion Test Bench for Tissue Boundary Delineation
by: Chen, Zacharias, et al.
Published: (2025)

FaceXBench: Evaluating Multimodal LLMs on Face Understanding
by: Narayan, Kartik, et al.
Published: (2025)

ArchXBench: A Complex Digital Systems Benchmark Suite for LLM Driven RTL Synthesis
by: Purini, Suresh, et al.
Published: (2025)

SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence
by: Zeng, Zhitao, et al.
Published: (2025)

TDBench: A Benchmark for Top-Down Image Understanding with Reliability Analysis of Vision-Language Models
by: Hou, Kaiyuan, et al.
Published: (2025)

CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning
by: Meng, Yang, et al.
Published: (2025)

SurgPub-Video: A Comprehensive Surgical Video Dataset for Enhanced Surgical Intelligence in Vision-Language Model
by: Li, Yaoqian, et al.
Published: (2025)

Computer Vision for Increased Operative Efficiency via Identification of Instruments in the Neurosurgical Operating Room: A Proof-of-Concept Study
by: Zachem, Tanner J., et al.
Published: (2023)

Trajectory Adaptation using Large Language Models
by: Maurya, Anurag, et al.
Published: (2025)

Bridging Vision and Language for Robust Context-Aware Surgical Point Tracking: The VL-SurgPT Dataset and Benchmark
by: Zhou, Rulin, et al.
Published: (2025)

SurgCheck: Do Vision-Language Models Really Look at Images in Surgical VQA?
by: Shin, Jongmin, et al.
Published: (2026)

Robust Uncertainty Quantification for Self-Evolving Large Language Models via Continual Domain Pretraining
by: Zhou, Xiaofan, et al.
Published: (2025)

GynSurg: A Comprehensive Gynecology Laparoscopic Surgery Dataset
by: Nasirihaghighi, Sahar, et al.
Published: (2025)

Certification and Classification of Linear Quantum Error Mitigation Methods
by: Blunden-Codd, Zach, et al.
Published: (2025)

SurgMLLMBench: A Multimodal Large Language Model Benchmark Dataset for Surgical Scene Understanding
by: Choi, Tae-Min, et al.
Published: (2025)

Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping
by: Zhang, Haoxi, et al.
Published: (2024)

BAA-NGP: Bundle-Adjusting Accelerated Neural Graphics Primitives
by: Liu, Sainan, et al.
Published: (2023)

Benchmarking Attribute Discrimination in Infant-Scale Vision-Language Models
by: Batsell, Patrick, et al.
Published: (2025)

debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias
by: Sasse, Kuleen, et al.
Published: (2024)

SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
by: Lin, Bo, et al.
Published: (2024)

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability
by: Shu, Dong, et al.
Published: (2025)

SurgX: Neuron-Concept Association for Explainable Surgical Phase Recognition
by: Kim, Ka Young, et al.
Published: (2025)

SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot
by: Wu, Jinlin, et al.
Published: (2024)

X-Driver: Explainable Autonomous Driving with Vision-Language Models
by: Liu, Wei, et al.
Published: (2025)

OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model
by: Chen, Qiguang, et al.
Published: (2026)

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model
by: Cheng, Cheng, et al.
Published: (2023)

EVLF-FM: Explainable Vision Language Foundation Model for Medicine
by: Bai, Yang, et al.
Published: (2025)

LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models
by: Li, Zuolei, et al.
Published: (2025)

BELL: Benchmarking the Explainability of Large Language Models
by: Ahmed, Syed Quiser, et al.
Published: (2025)

GroundedSurg: A Multi-Procedure Benchmark for Language-Conditioned Surgical Tool Segmentation
by: Ashraf, Tajamul, et al.
Published: (2026)

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
by: Wang, Zhaowei, et al.
Published: (2025)

On the Explainability of Vision-Language Models in Art History
by: Schneider, Stefanie
Published: (2026)

Patient‐derived organoid model of olfactory ensheathing cell tumor
by: Finlay, John B., et al.
Published: (2024)

Faculty Development Program on "Next-Generation Computing: Trends and Challenges" organized by Sharda University, Greater Noida
by: Chaturvedi, Ravi Prakash
Published: (2025)