Bibliographic Details
Main Authors: Zhou, Rulin, Wang, Guankun, Wang, An, Ma, Yujie, Ouyang, Lixin, Cui, Bolin, Li, Junyan, Zhu, Chaowei, Li, Mingyang, Chen, Ming, Zhong, Xiaopin, Lu, Peng, Wang, Jiankun, Liu, Xianming, Ren, Hongliang
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition, Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.20636
author Zhou, Rulin
Wang, Guankun
Wang, An
Ma, Yujie
Ouyang, Lixin
Cui, Bolin
Li, Junyan
Zhu, Chaowei
Li, Mingyang
Chen, Ming
Zhong, Xiaopin
Lu, Peng
Wang, Jiankun
Liu, Xianming
Ren, Hongliang
contents Accurate and stable field-of-view (FoV) guidance is critical for safe and efficient minimally invasive surgery, yet existing approaches often conflate visual attention estimation with downstream camera control or rely on direct object-centric assumptions. In this work, we formulate surgical attention tracking as a spatio-temporal learning problem and model surgeon focus as a dense attention heatmap, enabling continuous and interpretable frame-wise FoV guidance. We propose SurgAtt-Tracker, a holistic framework that robustly tracks surgical attention by exploiting temporal coherence through proposal-level reranking and motion-aware refinement, rather than direct regression. To support systematic training and evaluation, we introduce SurgAtt-1.16M, a large-scale benchmark with a clinically grounded annotation protocol that enables comprehensive heatmap-based attention analysis across procedures and institutions. Extensive experiments on multiple surgical datasets demonstrate that SurgAtt-Tracker consistently achieves state-of-the-art performance and strong robustness under occlusion, multi-instrument interference, and cross-domain settings. Beyond attention tracking, our approach provides a frame-wise FoV guidance signal that can directly support downstream robotic FoV planning and automatic camera control.
format Preprint
id arxiv_https___arxiv_org_abs_2602_20636
institution arXiv
publishDate 2026
record_format arxiv
title SurgAtt-Tracker: Online Surgical Attention Tracking via Temporal Proposal Reranking and Motion-Aware Refinement
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2602.20636