Staff View: Foundation Models for Credit Risk Prediction: A Game Changer? :: Library Catalog

Bibliographic Details
Main Authors:	Baesens, Bart; Goethals, Andreas; Lessmann, Stefan; De Vos, Simon; Bravo, Cristián; Martens, David; Medina-Olivares, Victor; Mues, Christophe; Oskarsdóttir, Maria; Broucke, Seppe vanden; Verdonck, Tim; Verbeke, Wouter
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.18147

_version_	1866909054080647168
author	Baesens, Bart; Goethals, Andreas; Lessmann, Stefan; De Vos, Simon; Bravo, Cristián; Martens, David; Medina-Olivares, Victor; Mues, Christophe; Oskarsdóttir, Maria; Broucke, Seppe vanden; Verdonck, Tim; Verbeke, Wouter
author_facet	Baesens, Bart; Goethals, Andreas; Lessmann, Stefan; De Vos, Simon; Bravo, Cristián; Martens, David; Medina-Olivares, Victor; Mues, Christophe; Oskarsdóttir, Maria; Broucke, Seppe vanden; Verdonck, Tim; Verbeke, Wouter
contents	Predictive models play a pivotal role in credit risk management, guiding critical decisions through accurate estimation of default probabilities and losses. Extensive research has introduced new modeling techniques, complemented by large-scale benchmarking studies consolidating the state-of-the-art. Today, quasi-standards such as gradient-boosting models paired with SHAP explainers have emerged, yet continuous improvement of risk models remains a top priority. Concurrently, rapid advancements in AI, most notably large language models, have disrupted predictive modeling paradigms. Foundation models, pretrained on extensive datasets from diverse domains, have demonstrated remarkable performance by leveraging prior knowledge. While prevalent in natural language processing and computer vision, foundation models for tabular data have only recently emerged. We conjecture that pretraining on out-of-domain data is particularly beneficial in small-data settings, such as SME lending or specialized corporate portfolios, and may help address longstanding challenges including low default portfolios and class imbalance. This paper benchmarks recently proposed tabular foundation models against a broad set of competitors, including established and advanced machine learning techniques, across two core tasks: PD and LGD modeling. Our evaluation encompasses various datasets, performance indicators, and experimental conditions. We find that tabular foundation models generally perform best across datasets and tasks. Moreover, they offer significant improvement in predictive performance as dataset size shrinks. These results are remarkable given that the models are tested out-of-the-box, without hyperparameter tuning, ensuring ease of use and mitigating computational costs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_18147
institution	arXiv
publishDate	2026
record_format	arxiv
title	Foundation Models for Credit Risk Prediction: A Game Changer?
topic	Machine Learning
url	https://arxiv.org/abs/2605.18147
