Bibliographic Details
Main Authors: Becker, Katinka, Oppelt, Maximilian P., Zech, Tobias S., Seyferth, Martin, Cabon, Sandie, Miskovic, Vanja, Cimrak, Ivan, Kozubek, Michal, D'Avenio, Giuseppe, Campioni, Ilaria, Fehr, Jana, De, Kanjar, Mahmoudi, Ismail, Cantu, Emilio Dolgener, Ottmann, Laurenz, Klaß, Andreas, Altares, Galaad, Ma, Jackie, M., Alireza Salehi, Lang-Richter, Nadine R., Schaeffter, Tobias, Schwabe, Daniel
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2601.22702
_version_ 1866912862569496576
author Becker, Katinka
Oppelt, Maximilian P.
Zech, Tobias S.
Seyferth, Martin
Cabon, Sandie
Miskovic, Vanja
Cimrak, Ivan
Kozubek, Michal
D'Avenio, Giuseppe
Campioni, Ilaria
Fehr, Jana
De, Kanjar
Mahmoudi, Ismail
Cantu, Emilio Dolgener
Ottmann, Laurenz
Klaß, Andreas
Altares, Galaad
Ma, Jackie
M., Alireza Salehi
Lang-Richter, Nadine R.
Schaeffter, Tobias
Schwabe, Daniel
contents Machine learning (ML) in medicine has transitioned from research to concrete applications that support medical purposes such as therapy selection, monitoring, and treatment. Acceptance and effective adoption by clinicians and patients, as well as regulatory approval, require evidence of trustworthiness. A major factor in developing trustworthy AI is quantifying the quality of the data used for AI model training and testing. We recently proposed the METRIC-framework for systematically evaluating the suitability (fit-for-purpose) of data for medical ML for a given task. Here, we operationalize this theoretical framework by introducing a collection of data quality metrics, the metric library, for measuring data quality dimensions in practice. For each metric, we provide a metric card with the most important information, including definition, applicability, examples, pitfalls, and recommendations, to support the understanding and implementation of these metrics. Furthermore, we discuss strategies and provide decision trees for choosing an appropriate set of data quality metrics from the metric library for a specific use case. We demonstrate the impact of our approach using the PTB-XL ECG dataset as an example. This is a first step toward enabling fit-for-purpose evaluation of training and test data in practice as a basis for establishing trustworthy AI in medicine.
format Preprint
id arxiv_https___arxiv_org_abs_2601_22702
institution arXiv
publishDate 2026
record_format arxiv
title Metric Hub: A metric library and practical selection workflow for use-case-driven data quality assessment in medical AI
topic Machine Learning
url https://arxiv.org/abs/2601.22702