Bibliographic Details
Main Authors: Ouyang, Xiaxue; Kang, Xinlai; Li, Mengyu; Dou, Zhenxing; Yu, Jun; Meng, Cheng
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2509.16085
_version_ 1866914047396413440
author Ouyang, Xiaxue
Kang, Xinlai
Li, Mengyu
Dou, Zhenxing
Yu, Jun
Meng, Cheng
contents We consider model-free feature screening in large-scale ultrahigh-dimensional data analysis. Existing feature screening methods often face substantial computational challenges when the sample size is large. To alleviate this computational burden, we propose a rank-based, model-free sure independence screening method (CR-SIS) and its efficient variant, BanditCR-SIS. The CR-SIS method, based on Chatterjee's rank correlation, is as straightforward to implement as the sure independence screening (SIS) method based on Pearson correlation introduced by Fan and Lv (2008), but it is significantly more powerful in detecting nonlinear relationships between variables. Motivated by the multi-armed bandit (MAB) problem, we reformulate the feature screening procedure to significantly reduce the computational complexity of CR-SIS. For a predictor matrix of size $n \times p$, the computational cost of CR-SIS is $O(n \log(n) p)$, while BanditCR-SIS reduces this to $O(\sqrt{n} \log(n) p + n \log(n))$. Theoretically, we establish the sure screening property for both CR-SIS and BanditCR-SIS under mild regularity conditions. Furthermore, we demonstrate the effectiveness of our methods through extensive experimental studies on both synthetic and real-world datasets. The results show that both methods outperform classical screening methods while requiring significantly less computational time.
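As a rough illustration of the screening idea described in the abstract, the sketch below scores each predictor by Chatterjee's rank correlation with the response and keeps the top d. It is not the authors' implementation: the function names `chatterjee_xi` and `cr_sis_sketch`, the cutoff `d`, and the no-ties assumption are choices made here for illustration, and the bandit-based variant is not shown. Sorting each column costs O(n log n), which matches the O(n log(n) p) total stated above.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(x, y); assumes no ties in x or y."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    order = np.argsort(x, kind="stable")           # sort the pairs by x
    y_sorted = y[order]
    ranks = np.argsort(np.argsort(y_sorted)) + 1   # rank of each y, in x-order
    # xi_n = 1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)

def cr_sis_sketch(X, y, d):
    """Hypothetical helper: score every column of X and keep the d largest."""
    scores = np.array([chatterjee_xi(X[:, j], y) for j in range(X.shape[1])])
    top = np.argsort(-scores)[:d]                  # indices of the top-d scores
    return top, scores

# Toy data (an assumption for illustration): y depends on column 3 through a
# symmetric, nonlinear function, so Pearson correlation is close to zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))
y = X[:, 3] ** 2 + 0.1 * rng.standard_normal(2000)
top, scores = cr_sis_sketch(X, y, d=5)
print("selected columns:", top)
print("Pearson |r| for column 3:", abs(np.corrcoef(X[:, 3], y)[0, 1]))
```

The toy response is quadratic in one predictor, the kind of nonlinear dependence that a Pearson-based ranking tends to miss and a rank-based score is designed to detect.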
format Preprint
id arxiv_https___arxiv_org_abs_2509_16085
institution arXiv
publishDate 2025
record_format arxiv
title A more efficient method for large-sample model-free feature screening via multi-armed bandits
topic Machine Learning
Computation
url https://arxiv.org/abs/2509.16085