Bibliographic Details
Main Authors: Bian, Yuan, Yi, Grace Y., He, Wenqing
Format: Preprint
Published: 2025
Subjects: Methodology
Online Access: https://arxiv.org/abs/2502.21276
_version_ 1866912914015780864
author Bian, Yuan
Yi, Grace Y.
He, Wenqing
contents Boosting has emerged as a useful machine learning technique over the past three decades and has attracted increasing attention. Most advances in this area, however, have focused on numerical implementation procedures and often lack rigorous theoretical justification. Moreover, these approaches are generally designed for fully observed datasets, and their validity can be compromised by the presence of missing observations. In this paper, we employ semiparametric estimation approaches to develop boosting prediction methods for data with missing responses. We explore two strategies for adjusting the loss functions to account for missingness effects. The proposed methods are implemented using a functional gradient descent algorithm, and their theoretical properties, including algorithm convergence and estimator consistency, are rigorously established. Numerical studies demonstrate that the proposed methods perform well in finite sample settings.
format Preprint
id arxiv_https___arxiv_org_abs_2502_21276
institution arXiv
publishDate 2025
record_format arxiv
title Boosting prediction with data missing not at random
topic Methodology
url https://arxiv.org/abs/2502.21276
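
The abstract in this record describes boosting by functional gradient descent with a loss adjusted for missing responses, but it does not say which two adjustments the authors use. The following is only a minimal sketch of one common adjustment, inverse probability weighting of a squared-error loss, and is not the authors' method. The function names (fit_stump, ipw_l2_boost, predict), the regression-stump base learner, the single feature x, and the assumption that the observation probabilities pi are already known are all hypothetical choices made for illustration. With data missing not at random, pi would itself have to be estimated, which this sketch does not attempt.

```python
import numpy as np

def fit_stump(x, target, w):
    """Weighted least-squares stump on a 1-D feature: (threshold, left, right)."""
    const = np.average(target, weights=w)
    # Fallback: a constant fit, encoded as a stump with threshold +inf.
    best = (np.sum(w * (target - const) ** 2), np.inf, const, const)
    for t in np.unique(x)[:-1]:
        left = x <= t
        wl, wr = w[left].sum(), w[~left].sum()
        if wl == 0.0 or wr == 0.0:
            continue  # one side has no observed responses
        cl = np.sum(w[left] * target[left]) / wl
        cr = np.sum(w[~left] * target[~left]) / wr
        err = np.sum(w * (target - np.where(left, cl, cr)) ** 2)
        if err < best[0]:
            best = (err, t, cl, cr)
    return best[1:]

def ipw_l2_boost(x, y, observed, pi, n_rounds=100, lr=0.1):
    """L2 boosting on the weighted loss sum_i (r_i / pi_i) * (y_i - F(x_i))^2 / 2.

    observed: 0/1 indicator r_i that y_i was recorded (at least one must be 1).
    pi: assumed-known probability that y_i is observed (hypothetical input).
    """
    w = observed / pi                        # weight is zero where y is missing
    y0 = np.where(observed == 1, y, 0.0)     # keep NaN out of the arithmetic
    f0 = np.average(y0, weights=w)           # weighted-mean starting value
    F = np.full(x.shape, f0)
    stumps = []
    for _ in range(n_rounds):
        resid = y0 - F                       # negative gradient of the L2 loss
        t, cl, cr = fit_stump(x, resid, w)   # base learner fitted with IPW weights
        F += lr * np.where(x <= t, cl, cr)   # small step along the fitted stump
        stumps.append((t, cl, cr))
    return f0, lr, stumps

def predict(model, x):
    """Evaluate the boosted predictor at new feature values."""
    f0, lr, stumps = model
    out = np.full(np.shape(x), f0, dtype=float)
    for t, cl, cr in stumps:
        out += lr * np.where(x <= t, cl, cr)
    return out
```

Calling ipw_l2_boost(x, y, observed, pi) returns a model that predict(model, x_new) can evaluate. Entries of y that are missing may be NaN, because their weight is zero and they are replaced before any arithmetic.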