Staff View: Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Fan; Bodic, Pierre Le; Kamp, Michael; Boley, Mario
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.15691

_version_	1866929256282456064
author	Yang, Fan; Bodic, Pierre Le; Kamp, Michael; Boley, Mario
author_facet	Yang, Fan; Bodic, Pierre Le; Kamp, Michael; Boley, Mario
contents	Gradient boosting of prediction rules is an efficient approach to learn potentially interpretable yet accurate probabilistic models. However, actual interpretability requires limiting the number and size of the generated rules, and existing boosting variants are not designed for this purpose. Though corrective boosting refits all rule weights in each iteration to minimise prediction risk, the included rule conditions tend to be sub-optimal, because commonly used objective functions fail to anticipate this refitting. Here, we address this issue by a new objective function that measures the angle between the risk gradient vector and the projection of the condition output vector onto the orthogonal complement of the already selected conditions. This approach correctly approximates the ideal update of adding the risk gradient itself to the model and favours the inclusion of more general and thus shorter rules. As we demonstrate using a wide range of prediction tasks, this significantly improves the comprehensibility/accuracy trade-off of the fitted ensemble. Additionally, we show how objective values for related rule conditions can be computed incrementally to avoid any substantial computational overhead of the new method.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_15691
institution	arXiv
publishDate	2024
record_format	arxiv
title	Orthogonal Gradient Boosting for Simpler Additive Rule Ensembles
topic	Machine Learning
url	https://arxiv.org/abs/2402.15691
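
Illustrative note on the contents field: the objective described in the abstract (the angle between the risk gradient vector and the projection of a condition's output vector onto the orthogonal complement of the already selected conditions) might be sketched as below. This is a minimal reading of the abstract only, not the authors' implementation. The function names, the use of plain Gram-Schmidt, and the choice to return the absolute cosine of the angle are all assumptions made for illustration; any regularisation or incremental computation used in the paper is omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def orthogonal_objective(gradient, candidate, selected, tol=1e-12):
    """Hypothetical helper: |cos| of the angle between `gradient` and the
    part of `candidate` orthogonal to the span of `selected`.

    `candidate` and each entry of `selected` are condition output vectors
    (e.g. 0/1 indicators over the training points).
    """
    q_perp = list(candidate)
    for b in orthonormal_basis(selected, tol):
        c = dot(q_perp, b)
        q_perp = [qi - c * bi for qi, bi in zip(q_perp, b)]
    norm_q = math.sqrt(dot(q_perp, q_perp))
    norm_g = math.sqrt(dot(gradient, gradient))
    if norm_q <= tol or norm_g <= tol:
        return 0.0  # candidate adds nothing new, or gradient is zero
    return abs(dot(gradient, q_perp)) / (norm_q * norm_g)


# Toy data (made up): four training points, one condition already selected.
gradient = [1.0, -1.0, 0.0, 0.0]
selected = [[1, 1, 0, 0]]
print(orthogonal_objective(gradient, [1, 0, 0, 0], selected))
print(orthogonal_objective(gradient, [1, 1, 0, 0], selected))
```

In this toy case the candidate [1, 0, 0, 0] scores higher once the selected condition is projected out than it does against the raw gradient with nothing selected, while a candidate that duplicates the selected condition scores zero. That is the behaviour the abstract attributes to anticipating the weight refit.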

Similar Items