:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Rulin; Wang, Guankun; Wang, An; Ma, Yujie; Ouyang, Lixin; Cui, Bolin; Li, Junyan; Zhu, Chaowei; Li, Mingyang; Chen, Ming; Zhong, Xiaopin; Lu, Peng; Wang, Jiankun; Liu, Xianming; Ren, Hongliang
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.20636

Similar Items

Bridging Vision and Language for Robust Context-Aware Surgical Point Tracking: The VL-SurgPT Dataset and Benchmark
by: Zhou, Rulin, et al.
Published: (2025)

TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT
by: Zhou, Rulin, et al.
Published: (2024)

Adapting SAM for Surgical Instrument Tracking and Segmentation in Endoscopic Submucosal Dissection Videos
by: Yu, Jieming, et al.
Published: (2024)

SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting
by: Huang, Yiming, et al.
Published: (2025)

Mask Focal Loss: A unifying framework for dense crowd counting with canonical object detection networks
by: Zhong, Xiaopin, et al.
Published: (2022)

Endo-TTAP: Robust Endoscopic Tissue Tracking via Multi-Facet Guided Attention and Hybrid Flow-point Supervision
by: Zhou, Rulin, et al.
Published: (2025)

SurgVidLM: Towards Multi-grained Surgical Video Understanding with Large Language Model
by: Wang, Guankun, et al.
Published: (2025)

Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery
by: Bai, Long, et al.
Published: (2024)

How can reasoning capability empower the AI copilot robot in endoscopic surgery
by: Wang, Guankun, et al.
Published: (2026)

CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection
by: Wang, Guankun, et al.
Published: (2024)

EndoControlMag: Robust Endoscopic Vascular Motion Magnification with Periodic Reference Resetting and Hierarchical Tissue-aware Dual-Mask Control
by: Wang, An, et al.
Published: (2025)

Geo-RepNet: Geometry-Aware Representation Learning for Surgical Phase Recognition in Endoscopic Submucosal Dissection
by: Tang, Rui, et al.
Published: (2025)

OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery
by: Bai, Long, et al.
Published: (2024)

SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation
by: Chen, Tong, et al.
Published: (2024)

BleedOrigin: Dynamic Bleeding Source Localization in Endoscopic Submucosal Dissection via Dual-Stage Detection and Tracking
by: Xu, Mengya, et al.
Published: (2025)

SurgMotion: A Video-Native Foundation Model for Universal Understanding of Surgical Videos
by: Wu, Jinlin, et al.
Published: (2026)

EndoOOD: Uncertainty-aware Out-of-distribution Detection in Capsule Endoscopy Diagnosis
by: Tan, Qiaozhi, et al.
Published: (2024)

TMR-VLA: Vision-Language-Action Model for Magnetic Motion Control of Tri-leg Silicone-based Soft Robot
by: Tang, Ruijie, et al.
Published: (2026)

Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization
by: Wu, Junyan, et al.
Published: (2024)

LightFC-X: Lightweight Convolutional Tracker for RGB-X Tracking
by: Li, Yunfeng, et al.
Published: (2025)

EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy
by: Ng, Chi Kit, et al.
Published: (2025)

DeTracker: Motion-decoupled Vehicle Detection and Tracking in Unstabilized Satellite Videos
by: Chen, Jiajun, et al.
Published: (2026)

Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement
by: Bai, Long, et al.
Published: (2025)

PDZSeg: Adapting the Foundation Model for Dissection Zone Segmentation with Visual Prompts in Robot-assisted Endoscopic Submucosal Dissection
by: Xu, Mengya, et al.
Published: (2024)

SurgTrack: CAD-Free 3D Tracking of Real-world Surgical Instruments
by: Guo, Wenwu, et al.
Published: (2024)

Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
by: Wang, Guankun, et al.
Published: (2024)

UniTracker: Learning Universal Whole-Body Motion Tracker for Humanoid Robots
by: Yin, Kangning, et al.
Published: (2025)

Dual-Rerank: Fusing Causality and Utility for Industrial Generative Reranking
by: Zhang, Chao, et al.
Published: (2026)

SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference
by: Chen, Zhen, et al.
Published: (2024)

RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker
by: Li, Yunfeng, et al.
Published: (2024)

Intuitive Surgical SurgToolLoc and SurgVU Challenges Results: 2022-2025
by: Zia, Aneeq, et al.
Published: (2023)

Surgical Visual Understanding (SurgVU) Dataset
by: Zia, Aneeq, et al.
Published: (2025)

EndoARSS: Adapting Spatially-Aware Foundation Model for Efficient Activity Recognition and Semantic Segmentation in Endoscopic Surgery
by: Wang, Guankun, et al.
Published: (2025)

ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
by: Liu, Haofeng, et al.
Published: (2025)

OmniTracker: Unifying Object Tracking by Tracking-with-Detection
by: Wang, Junke, et al.
Published: (2023)

Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression
by: Li, Daxin, et al.
Published: (2024)

Collision Risk Quantification and Conflict Resolution in Trajectory Tracking for Acceleration-Actuated Multi-Robot Systems
by: Li, Xiaoxiao, et al.
Published: (2025)

TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion
by: Liu, Haoyue, et al.
Published: (2025)

SurgPose: a Dataset for Articulated Robotic Surgical Tool Pose Estimation and Tracking
by: Wu, Zijian, et al.
Published: (2025)