Staff View: Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Brian; Mazumder, Rahul; Radchenko, Peter
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2506.20114

_version_	1866914432117899264
author	Liu, Brian; Mazumder, Rahul; Radchenko, Peter
author_facet	Liu, Brian; Mazumder, Rahul; Radchenko, Peter
contents	Tree ensembles are non-parametric methods widely recognized for their accuracy and ability to capture complex interactions. While these models excel at prediction, they are difficult to interpret and may fail to uncover useful relationships in the data. We propose an estimator to extract compact sets of decision rules from tree ensembles. The extracted models are accurate and can be manually examined to reveal relationships between the predictors and the response. A key novelty of our estimator is the flexibility to jointly control the number of rules extracted and the interaction depth of each rule, which improves accuracy. We develop a tailored exact algorithm to efficiently solve optimization problems underlying our estimator and an approximate algorithm for computing regularization paths, sequences of solutions that correspond to varying model sizes. We also establish novel non-asymptotic prediction error bounds for our proposed approach, comparing it to an oracle that chooses the best data-dependent linear combination of the rules in the ensemble subject to the same complexity constraint as our estimator. The bounds illustrate that the large-sample predictive performance of our estimator is on par with that of the oracle. Through experiments, we demonstrate that our estimator outperforms existing algorithms for rule extraction.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_20114
institution	arXiv
publishDate	2025
record_format	arxiv
title	Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives
topic	Machine Learning
url	https://arxiv.org/abs/2506.20114

Similar Items