Bibliographic Details
Main Authors: Cao, Yang; Lin, Hangyu; Sun, Xinwei; Yao, Yuan
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2507.01732
_version_ 1866918135509024768
author Cao, Yang
Lin, Hangyu
Sun, Xinwei
Yao, Yuan
contents Controlling the False Discovery Rate (FDR) is critical for reproducible variable selection, especially given the prevalence of complex predictive modeling. The recent Split Knockoff method, an extension of the canonical Knockoffs framework, offers finite-sample FDR control for selecting sparse transformations but is limited to linear models with fixed designs. Extending this framework to random designs, which would accommodate a much broader range of models, is challenged by the fundamental difficulty of reconciling a random covariate design with a deterministic linear transformation. To bridge this gap, we introduce Model-X Split Knockoffs. Our method achieves robust FDR control for transformation selection in random designs by introducing a novel auxiliary randomized design. This key innovation effectively mediates the interaction between the random design and the deterministic transformation, enabling the construction of valid knockoffs. Like the classical Model-X framework, our approach provides provable, finite-sample FDR control under known or accurately estimated covariate distributions, regardless of the response's conditional distribution. Importantly, it guarantees at least the same, and often superior, selection power as standard Model-X Knockoffs when both are applicable. Empirical studies, including simulations and real-world applications to Alzheimer's disease imaging and university ranking analysis, demonstrate robust FDR control and improved statistical power.
format Preprint
id arxiv_https___arxiv_org_abs_2507_01732
institution arXiv
publishDate 2025
record_format arxiv
title Gold after Randomized Sand: Model-X Split Knockoffs for Controlled Transformation Selection
topic Methodology
url https://arxiv.org/abs/2507.01732