| Main Authors: | Lin, Wenjun, Hu, Yan, Fu, Huazhu, Yang, Mingming, Chng, Chin-Boon, Kawasaki, Ryo, Chui, Cheekong, Liu, Jiang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.00322 |
Similar Items
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding
by: Hao, Luoying, et al.
Published: (2025)
ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection
by: Lin, Junhao, et al.
Published: (2024)
Event-Level Detection of Surgical Instrument Handovers in Videos with Interpretable Vision Models
by: Katsarou, Katerina, et al.
Published: (2026)
Vivim: a Video Vision Mamba for Medical Video Segmentation
by: Yang, Yijun, et al.
Published: (2024)
Token Merging via Spatiotemporal Information Mining for Surgical Video Understanding
by: Jiang, Xixi, et al.
Published: (2025)
Out-Of-Distribution Detection with Diversification (Provably)
by: Yao, Haiyun, et al.
Published: (2024)
An Efficient Streaming Video Understanding Framework with Agentic Control
by: Liu, Jinming, et al.
Published: (2026)
Memory-Augmented Multimodal LLMs for Surgical VQA via Self-Contained Inquiry
by: Hou, Wenjun, et al.
Published: (2024)
Rethinking Text-Promptable Surgical Instrument Segmentation with Robust Framework
by: Choi, Tae-Min, et al.
Published: (2024)
Surgical Video Understanding with Label Interpolation
by: Kim, Garam, et al.
Published: (2025)
Is Dataset Quality Still a Concern in Diagnosis Using Large Foundation Model?
by: Lin, Ziqin, et al.
Published: (2024)
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
by: Hu, Ming, et al.
Published: (2024)
EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)
Neural shape reconstruction from multiple views with static pattern projection
by: Furukawa, Ryo, et al.
Published: (2025)
Weakly Supervised YOLO Network for Surgical Instrument Localization in Endoscopic Videos
by: Wei, Rongfeng, et al.
Published: (2023)
Instrument-Splatting: Controllable Photorealistic Reconstruction of Surgical Instruments Using Gaussian Splatting
by: Yang, Shuojue, et al.
Published: (2025)
SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation
by: Yue, Wenxi, et al.
Published: (2023)
Text-driven Multiplanar Visual Interaction for Semi-supervised Medical Image Segmentation
by: Huang, Kaiwen, et al.
Published: (2025)
TrajPred: Trajectory-Conditioned Joint Embedding Prediction for Surgical Instrument-Tissue Interaction Recognition in Vision-Language Models
by: Cheng, Jiajun, et al.
Published: (2026)
Video Dataset for Surgical Phase, Keypoint, and Instrument Recognition in Laparoscopic Surgery (PhaKIR)
by: Rueckert, Tobias, et al.
Published: (2025)
Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR
by: Shi, Baoshun, et al.
Published: (2025)
Cycle Context Verification for In-Context Medical Image Segmentation
by: Hu, Shishuai, et al.
Published: (2025)
Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
by: Fang, Zheng, et al.
Published: (2025)
Training-free Detection and 6D Pose Estimation of Unseen Surgical Instruments
by: Hein, Jonas, et al.
Published: (2026)
ToolTipNet: A Segmentation-Driven Deep Learning Baseline for Surgical Instrument Tip Detection
by: Wu, Zijian, et al.
Published: (2025)
A Comprehensive Augmentation Framework for Anomaly Detection
by: Lin, Jiang, et al.
Published: (2023)
Prompting Lipschitz-constrained network for multiple-in-one sparse-view CT reconstruction
by: Shi, Baoshun, et al.
Published: (2025)
SurgMotion: A Video-Native Foundation Model for Universal Understanding of Surgical Videos
by: Wu, Jinlin, et al.
Published: (2026)
Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything
by: Wu, Zijian, et al.
Published: (2024)
Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
by: Gao, Sensen, et al.
Published: (2025)
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding
by: Chen, Zhen, et al.
Published: (2024)
Real-Time Surgical Instrument Defect Detection via Non-Destructive Testing
by: Ain, Qurrat Ul, et al.
Published: (2025)
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
by: Yang, Ruoliu, et al.
Published: (2026)
Real-time Rendering-based Surgical Instrument Tracking via Evolutionary Optimization
by: Hu, Hanyang, et al.
Published: (2026)
Topicwise Separable Sentence Retrieval for Medical Report Generation
by: Zhao, Junting, et al.
Published: (2024)
PathFL: Multi-Alignment Federated Learning for Pathology Image Segmentation
by: Zhang, Yuan, et al.
Published: (2025)
Vision-Language Model IP Protection via Prompt-based Learning
by: Wang, Lianyu, et al.
Published: (2025)
Amodal Segmentation for Laparoscopic Surgery Video Instruments
by: Shi, Ruohua, et al.
Published: (2024)
UniVRSE: Unified Vision-conditioned Response Semantic Entropy for Hallucination Detection in Medical Vision-Language Models
by: Liao, Zehui, et al.
Published: (2025)