Bibliographic Details
Main Authors: Yajima, Haruki; Matsui, Yusuke
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2605.28068
author Yajima, Haruki
Matsui, Yusuke
author_facet Yajima, Haruki
Matsui, Yusuke
contents Tree ensembles are machine learning models with strong predictive performance and interpretability, and remain widely used for tabular data. Standard pruning methods for tree ensembles typically optimize an accuracy-compression trade-off and may change a subset of predictions, potentially compromising decision consistency. Faithful pruning methods address this issue by preserving prediction equivalence over the entire input space, but this requirement leads to lower compression ratios. We propose PINE, a pruning method that provides strong guarantees within an in-distribution region. PINE preserves prediction equivalence within this region and controls the region size using a single parameter $α$ via conformal calibration. Experiments on 12 public tabular datasets show that PINE improves the compression ratio by up to 30% while preserving predictions at a comparable level to existing faithful pruning methods.
format Preprint
id arxiv_https___arxiv_org_abs_2605_28068
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle PINE: Pruning Boosted Tree Ensembles with Conformal In-Distribution Prediction Equivalence
Yajima, Haruki
Matsui, Yusuke
Machine Learning
title PINE: Pruning Boosted Tree Ensembles with Conformal In-Distribution Prediction Equivalence
topic Machine Learning
url https://arxiv.org/abs/2605.28068