| Main Authors: | , , , , |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.00905 |
Table of Contents:
- The use of coarse demographic adjustments in clinical equations has been increasingly scrutinized. In particular, adjustments for race have sparked significant debate with several medical professional societies recommending race-neutral equations in recent years. However, current approaches to remove race from clinical equations do not address the underlying causes of observed differences. Here, we present ARC (Approach for identifying pRoxies of demographic Correction), a framework to identify explanatory factors of group-level differences, which may inform the development of more accurate and precise clinical equations. We apply ARC to spirometry tests across two observational cohorts, CDC NHANES and UK Biobank, comprising 159,893 participants. Cross-sectional sociodemographic or exposure measures did not explain differences in reference lung function across race groups beyond those already explained by age, sex, and height. By contrast, sitting height accounted for up to 26% of the remaining differences in lung volumes between healthy Black and White adults. We then demonstrate how pulmonary function test (PFT) reference equations can incorporate these factors in a new set of equations called $ARC_{PFT}$, surpassing the predictive performance of the race-neutral GLI-Global equation recommended by major pulmonary societies. When compared to GLI-Global, inclusion of sitting height and waist circumference in $ARC_{PFT}$ decreased mean absolute error by 13% among Black participants in the UK Biobank and by 24% in NHANES. $ARC_{PFT}$ also had reduced vulnerability to domain shift compared to race-based methods, with mean absolute error 19.3% and 35.6% lower than race-stratified models in out-of-sample Asian and Hispanic populations, respectively. This approach provides a path for understanding the proxies of imprecise demographic adjustments and developing personalized clinical equations.
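
The comparisons above are reported as percentage reductions in mean absolute error (MAE) relative to a baseline equation. As a minimal sketch of that arithmetic only, the Python snippet below computes MAE for two sets of predictions and the relative reduction between them. The function names and the lung-volume values are hypothetical and made up for illustration; they are not taken from the preprint, and the snippet does not reproduce the $ARC_{PFT}$ or GLI-Global equations.

```python
from statistics import fmean


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return fmean(abs(o - p) for o, p in zip(observed, predicted))


def relative_mae_reduction(observed, baseline_pred, candidate_pred):
    """Fraction by which the candidate lowers MAE relative to the baseline."""
    baseline_mae = mean_absolute_error(observed, baseline_pred)
    candidate_mae = mean_absolute_error(observed, candidate_pred)
    return (baseline_mae - candidate_mae) / baseline_mae


# Hypothetical lung volumes in litres (illustrative values, not study data).
observed = [3.0, 3.5, 4.0, 2.5]
baseline_pred = [3.4, 3.9, 4.4, 2.9]   # stands in for a baseline equation
candidate_pred = [3.3, 3.8, 4.3, 2.8]  # stands in for an equation with extra covariates

reduction = relative_mae_reduction(observed, baseline_pred, candidate_pred)
print(f"MAE reduction relative to baseline: {reduction:.1%}")
```

A positive value means the candidate predictions are closer to the observations on average; a negative value means they are further away.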