Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Guo, Xuhui, Dam, Tanmoy, Dhamdhere, Rohan, Modanwal, Gourav, Madabhushi, Anant
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.07017
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866916794678116352
author	Guo, Xuhui Dam, Tanmoy Dhamdhere, Rohan Modanwal, Gourav Madabhushi, Anant
author_facet	Guo, Xuhui Dam, Tanmoy Dhamdhere, Rohan Modanwal, Gourav Madabhushi, Anant
contents	3D medical image segmentation has progressed considerably due to Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), yet these methods struggle to balance long-range dependency acquisition with computational efficiency. To address this challenge, we propose UNETVL (U-Net Vision-LSTM), a novel architecture that leverages recent advancements in temporal information processing. UNETVL incorporates Vision-LSTM (ViL) for improved scalability and memory functions, alongside an efficient Chebyshev Kolmogorov-Arnold Networks (KAN) to handle complex and long-range dependency patterns more effectively. We validated our method on the ACDC and AMOS2022 (post challenge Task 2) benchmark datasets, showing a significant improvement in mean Dice score compared to recent state-of-the-art approaches, especially over its predecessor, UNETR, with increases of 7.3% on ACDC and 15.6% on AMOS, respectively. Extensive ablation studies were conducted to demonstrate the impact of each component in UNETVL, providing a comprehensive understanding of its architecture. Our code is available at https://github.com/tgrex6/UNETVL, facilitating further research and applications in this domain.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_07017
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	UNetVL: Enhancing 3D Medical Image Segmentation with Chebyshev KAN Powered Vision-LSTM Guo, Xuhui Dam, Tanmoy Dhamdhere, Rohan Modanwal, Gourav Madabhushi, Anant Computer Vision and Pattern Recognition Artificial Intelligence 3D medical image segmentation has progressed considerably due to Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), yet these methods struggle to balance long-range dependency acquisition with computational efficiency. To address this challenge, we propose UNETVL (U-Net Vision-LSTM), a novel architecture that leverages recent advancements in temporal information processing. UNETVL incorporates Vision-LSTM (ViL) for improved scalability and memory functions, alongside an efficient Chebyshev Kolmogorov-Arnold Networks (KAN) to handle complex and long-range dependency patterns more effectively. We validated our method on the ACDC and AMOS2022 (post challenge Task 2) benchmark datasets, showing a significant improvement in mean Dice score compared to recent state-of-the-art approaches, especially over its predecessor, UNETR, with increases of 7.3% on ACDC and 15.6% on AMOS, respectively. Extensive ablation studies were conducted to demonstrate the impact of each component in UNETVL, providing a comprehensive understanding of its architecture. Our code is available at https://github.com/tgrex6/UNETVL, facilitating further research and applications in this domain.
title	UNetVL: Enhancing 3D Medical Image Segmentation with Chebyshev KAN Powered Vision-LSTM
topic	Computer Vision and Pattern Recognition Artificial Intelligence
url	https://arxiv.org/abs/2501.07017

Similar Items