Staff View: The C-index Multiverse :: Library Catalog

Bibliographic Details
Main Authors:	Sierra, Begoña B.; McLean, Colin; Hall, Peter S.; Vallejos, Catalina A.
Format:	Preprint
Published:	2025
Subjects:	Machine Learning Applications
Online Access:	https://arxiv.org/abs/2508.14821

_version_	1866913998712078336
author	Sierra, Begoña B.; McLean, Colin; Hall, Peter S.; Vallejos, Catalina A.
author_facet	Sierra, Begoña B.; McLean, Colin; Hall, Peter S.; Vallejos, Catalina A.
contents	Quantifying out-of-sample discrimination performance for time-to-event outcomes is a fundamental step for model evaluation and selection in the context of predictive modelling. The concordance index, or C-index, is a widely used metric for this purpose, particularly with the growing development of machine learning methods. Beyond differences between proposed C-index estimators (e.g. Harrell's, Uno's and Antolini's), we demonstrate the existence of a C-index multiverse among available R and Python software, where seemingly equal implementations can yield different results. This can undermine reproducibility and complicate fair comparisons across models and studies. Key sources of variation include tie handling and adjustment for censoring. Additionally, the absence of a standardised approach to summarising risk from survival distributions results in another source of variation that depends on input types. We demonstrate the consequences of the C-index multiverse when quantifying predictive performance for several survival models (from Cox proportional hazards to recent deep learning approaches) on publicly available breast cancer data and semi-synthetic examples. Our work emphasises the need for better reporting to improve transparency and reproducibility. This article aims to be a useful guideline that helps analysts navigate the multiverse by providing unified documentation and highlighting potential pitfalls of existing software. All code is publicly available at: www.github.com/BBolosSierra/CindexMultiverse.
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_14821
institution	arXiv
publishDate	2025
record_format	arxiv
title	The C-index Multiverse
topic	Machine Learning Applications
url	https://arxiv.org/abs/2508.14821

Similar Items