Bibliographic Details
Main Authors: Acar, Ayberk; Smith, Mariana; Al-Zogbi, Lidia; Watts, Tanner; Li, Fangjie; Li, Hao; Yilmaz, Nural; Scheikl, Paul Maria; d'Almeida, Jesse F.; Sharma, Susheela; Branscombe, Lauren; Ertop, Tayfun Efe; Webster III, Robert J.; Oguz, Ipek; Kuntz, Alan; Krieger, Axel; Wu, Jie Ying
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Robotics
Online Access: https://arxiv.org/abs/2503.16263
_version_ 1866915206643318784
author Acar, Ayberk
Smith, Mariana
Al-Zogbi, Lidia
Watts, Tanner
Li, Fangjie
Li, Hao
Yilmaz, Nural
Scheikl, Paul Maria
d'Almeida, Jesse F.
Sharma, Susheela
Branscombe, Lauren
Ertop, Tayfun Efe
Webster III, Robert J.
Oguz, Ipek
Kuntz, Alan
Krieger, Axel
Wu, Jie Ying
contents Surgical automation requires precise guidance and understanding of the scene. Current methods in the literature rely on bulky depth cameras to create maps of the anatomy; however, this approach does not translate well to space-limited clinical applications. Monocular cameras are small and allow minimally invasive surgeries in tight spaces, but additional processing is required to generate 3D scene understanding. We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy. To ensure the most precise reconstruction, we compare the performance of different structure-from-motion algorithms on mapping central airway obstructions, and test the pipeline on a downstream task of tumor resection. On several metrics, including post-procedure tissue model evaluation, our pipeline performs comparably to RGB-D cameras and, in some cases, even surpasses their performance. These promising results demonstrate that automation guidance can be achieved in minimally invasive procedures with monocular cameras. This study is a step toward the complete autonomy of surgical robots.
format Preprint
id arxiv_https___arxiv_org_abs_2503_16263
institution arXiv
publishDate 2025
record_format arxiv
title From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction
topic Computer Vision and Pattern Recognition
Robotics
url https://arxiv.org/abs/2503.16263