Staff View: Echo-E$^3$Net: Efficient Endocardial Spatio-Temporal Network for Ejection Fraction Estimation :: Library Catalog

Bibliographic Details
Main Authors:	Heidari, Moein; Bozorgpour, Afshin; Zarif-Fakharnia, AmirHossein; Chen, Wenjin; Merhof, Dorit; Foran, David J; Grewal, Jasmine; Hacihaliloglu, Ilker
Format:	Preprint
Published:	2025
Subjects:	Image and Video Processing; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.17543

_version_	1866910053204754432
author	Heidari, Moein; Bozorgpour, Afshin; Zarif-Fakharnia, AmirHossein; Chen, Wenjin; Merhof, Dorit; Foran, David J; Grewal, Jasmine; Hacihaliloglu, Ilker
author_facet	Heidari, Moein; Bozorgpour, Afshin; Zarif-Fakharnia, AmirHossein; Chen, Wenjin; Merhof, Dorit; Foran, David J; Grewal, Jasmine; Hacihaliloglu, Ilker
contents	Objective: To develop a robust and computationally efficient deep learning model for automated left ventricular ejection fraction (LVEF) estimation from echocardiography videos that is suitable for real-time point-of-care ultrasound (POCUS) deployment. Methods: We propose Echo-E$^3$Net, an endocardial spatio-temporal network that explicitly incorporates cardiac anatomy into LVEF prediction. The model comprises a dual-phase Endocardial Border Detector (E$^2$CBD) that uses phase-specific cross attention to localize end-diastolic and end-systolic endocardial landmarks and to learn phase-aware landmark embeddings, and an Endocardial Feature Aggregator (E$^2$FA) that fuses these embeddings with global statistical descriptors of deep feature maps to refine EF regression. Training is guided by a multi-component loss inspired by Simpson's biplane method that jointly supervises EF and landmark geometry. We evaluate Echo-E$^3$Net on the EchoNet-Dynamic dataset using RMSE and R$^2$ while reporting parameter count and GFLOPs to characterize efficiency. Results: On EchoNet-Dynamic, Echo-E$^3$Net achieves an RMSE of 5.20 and an R$^2$ score of 0.82 while using only 1.55M parameters and 8.05 GFLOPs. The model operates without external pre-training, heavy data augmentation, or test-time ensembling, supporting practical real-time deployment. Conclusion: By combining phase-aware endocardial landmark modeling with lightweight spatio-temporal feature aggregation, Echo-E$^3$Net improves the efficiency and robustness of automated LVEF estimation and is well-suited for scalable clinical use in POCUS settings. Code is available at https://github.com/moeinheidari7829/Echo-E3Net
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_17543
institution	arXiv
publishDate	2025
record_format	arxiv
title	Echo-E$^3$Net: Efficient Endocardial Spatio-Temporal Network for Ejection Fraction Estimation
topic	Image and Video Processing; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2503.17543
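
The contents field above refers to ejection fraction, Simpson's biplane method, RMSE, and R$^2$ without defining them. The following is a minimal sketch of the standard textbook definitions only; it is not taken from the Echo-E$^3$Net code base, and all function names and sample values are hypothetical illustrations.

```python
import math


def simpson_biplane_volume(diam_a4c, diam_a2c, long_axis_length):
    """Method-of-disks volume: V = (pi / 4) * sum(a_i * b_i) * (L / n).

    diam_a4c, diam_a2c: per-disk diameters from two orthogonal views
    (hypothetical inputs; same length n). long_axis_length: L.
    """
    if not diam_a4c or len(diam_a4c) != len(diam_a2c):
        raise ValueError("diameter lists must be non-empty and equal in length")
    disk_height = long_axis_length / len(diam_a4c)
    return math.pi / 4.0 * sum(a * b for a, b in zip(diam_a4c, diam_a2c)) * disk_height


def ejection_fraction(edv, esv):
    """EF (%) from end-diastolic and end-systolic volumes."""
    if edv <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv - esv) / edv


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted EF values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, `ejection_fraction(120.0, 50.0)` evaluates the formula for an illustrative (not measured) pair of volumes, and `rmse` and `r2_score` can be applied to any paired lists of reference and predicted EF values.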
