| Main Authors: | Li, Bohan, Yang, Shuojue, Peng, Baorui, Guo, Xianda, Zhang, Erli, Tao, Youqi, Duan, Junfeng, Xu, Daguang, Dou, Qi, Jin, Xin, Zeng, Wenjun, Zhao, Hao, Jin, Yueming |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.08712 |
Similar Items
Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting
by: Yang, Shuojue, et al.
Published: (2024)
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
by: Liu, Haofeng, et al.
Published: (2024)
SurfSurg6D: Geometry Consistent Dense Correspondence for Textureless Surgical Instrument Pose Estimation
by: Shen, Daiyun, et al.
Published: (2026)
ToolTipNet: A Segmentation-Driven Deep Learning Baseline for Surgical Instrument Tip Detection
by: Wu, Zijian, et al.
Published: (2025)
Free-DyGS: Camera-Pose-Free Scene Reconstruction for Dynamic Surgical Videos with Gaussian Splatting
by: Li, Qian, et al.
Published: (2024)
OmniNWM: Omniscient Driving Navigation World Models
by: Li, Bohan, et al.
Published: (2025)
Instrument-Splatting: Controllable Photorealistic Reconstruction of Surgical Instruments Using Gaussian Splatting
by: Yang, Shuojue, et al.
Published: (2025)
BCRNet: Enhancing Landmark Detection in Laparoscopic Liver Surgery via Bezier Curve Refinement
by: Li, Qian, et al.
Published: (2025)
Instrument-Splatting++: Towards Controllable Surgical Instrument Digital Twin Using Gaussian Splatting
by: Yang, Shuojue, et al.
Published: (2026)
SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence
by: Zeng, Zhitao, et al.
Published: (2025)
ReWorld: Multi-Dimensional Reward Modeling for Embodied World Models
by: Peng, Baorui, et al.
Published: (2026)
Cosmos-H-Surgical: Learning Surgical Robot Policies from Videos via World Modeling
by: He, Yufan, et al.
Published: (2025)
SurgCalib: Gaussian Splatting-Based Hand-Eye Calibration for Robot-Assisted Minimally Invasive Surgery
by: Wu, Zijian, et al.
Published: (2026)
Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
by: Fang, Zheng, et al.
Published: (2025)
Generalized Recognition of Basic Surgical Actions Enables Skill Assessment and Vision-Language-Model-based Surgical Planning
by: Xu, Mengya, et al.
Published: (2026)
Surg$Σ$: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence
by: Zeng, Zhitao, et al.
Published: (2026)
Articulated Kinematics Distillation from Video Diffusion Models
by: Li, Xuan, et al.
Published: (2025)
EA-WM: Event-Aware Generative World Model with Structured Kinematic-to-Visual Action Fields
by: Yang, Zhaoyang, et al.
Published: (2026)
Think Step by Step: Chain-of-Gesture Prompting for Error Detection in Robotic Surgical Videos
by: Shao, Zhimin, et al.
Published: (2024)
SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Robotic Surgical Video Analysis
by: Low, Chang Han, et al.
Published: (2025)
Systematic Evaluation and Guidelines for Segment Anything Model in Surgical Video Analysis
by: Yuan, Cheng, et al.
Published: (2024)
Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction
by: Li, Bohan, et al.
Published: (2024)
Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning
by: Wang, Qi, et al.
Published: (2023)
Kinematic-Based Assessment of Surgical Actions in Microanastomosis
by: Meng, Yan, et al.
Published: (2025)
Closed-Loop Unsupervised Representation Disentanglement with $β$-VAE Distillation and Diffusion Probabilistic Feedback
by: Jin, Xin, et al.
Published: (2024)
Surgical Action Planning with Large Language Models
by: Xu, Mengya, et al.
Published: (2025)
Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion
by: Li, Bohan, et al.
Published: (2024)
One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception
by: Li, Bohan, et al.
Published: (2023)
NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation
by: Xie, Baao, et al.
Published: (2023)
Temporally Guided Articulated Hand Pose Tracking in Surgical Videos
by: Louis, Nathan, et al.
Published: (2021)
SAW: Toward a Surgical Action World Model via Controllable and Scalable Video Generation
by: Rapuri, Sampath, et al.
Published: (2026)
SurgFed: Language-guided Multi-Task Federated Learning for Surgical Video Understanding
by: Fang, Zheng, et al.
Published: (2026)
SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
by: Chen, Juo-Tung, et al.
Published: (2025)
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
by: Liu, Haofeng, et al.
Published: (2025)
AU-vMAE: Knowledge-Guide Action Units Detection via Video Masked Autoencoder
by: Jin, Qiaoqiao, et al.
Published: (2024)
RefDecoder: Enhancing Visual Generation with Conditional Video Decoding
by: Fan, Xiang, et al.
Published: (2026)
VAGPO: Vision-augmented Asymmetric Group Preference Optimization for Graph Routing Problems
by: Liu, Shiyan, et al.
Published: (2025)
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking
by: Liu, Haofeng, et al.
Published: (2025)
MViewRouter: Internalizing Geometric Equivariance via Multi-view Alternating Attention for Combinatorial Routing
by: Liu, Shiyan, et al.
Published: (2026)
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation
by: Luo, Xiangyang, et al.
Published: (2026)