Staff View: Looking Beyond Accuracy: A Holistic Benchmark of ECG Foundation Models :: Library Catalog

Bibliographic Details
Main Authors:	Filice, Francesca; De Rose, Edoardo; Bartucci, Simone; Calimeri, Francesco; Perri, Simona
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.21830

_version_	1866912859521286144
author	Filice, Francesca; De Rose, Edoardo; Bartucci, Simone; Calimeri, Francesco; Perri, Simona
author_facet	Filice, Francesca; De Rose, Edoardo; Bartucci, Simone; Calimeri, Francesco; Perri, Simona
contents	The electrocardiogram (ECG) is a cost-effective, highly accessible, and widely used diagnostic tool. With the advent of Foundation Models (FMs), AI-assisted ECG interpretation has begun to evolve, because FMs enable model reuse across different tasks through the embeddings they produce. However, to employ FMs responsibly, it is crucial to rigorously assess to what extent those embeddings generalize, particularly in error-sensitive domains such as healthcare. Although prior works have addressed the problem of benchmarking ECG-expert FMs, they focus predominantly on downstream performance. To fill this gap, this study aims to provide an in-depth, comprehensive benchmarking framework for FMs, with a specific focus on ECG-expert ones. To this end, we introduce a benchmark methodology that complements performance-based evaluation with representation-level analysis, leveraging SHAP and UMAP techniques. Furthermore, we use the methodology to carry out an extensive evaluation of several ECG-expert FMs pretrained via state-of-the-art techniques, over different cross-continental datasets and data availability settings; these include settings featuring data scarcity, a fairly common situation in real-world medical scenarios. Experimental results show that our benchmarking protocol provides rich insight into the patterns embedded by ECG-expert FMs, enabling a deeper understanding of their representational structure and generalizability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_21830
institution	arXiv
publishDate	2026
record_format	arxiv
title	Looking Beyond Accuracy: A Holistic Benchmark of ECG Foundation Models
topic	Artificial Intelligence
url	https://arxiv.org/abs/2601.21830

Similar Items