Bibliographic Details
Main Authors: Zhou, Yuhang, Zhao, Zhuokai, Li, Ke, Evmorfos, Spilios, Demirci, Gökalp, Wang, Mingyi, Liu, Qiao, Wang, Qifei, Li, Serena, Li, Weiwei, Wang, Tingting, Gao, Mingze, Zhou, Gedi, Kumar, Abhishek, Fan, Xiangjun, Zhang, Lizhu, Liu, Jiayi
Format: Preprint
Published: 2026
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2603.24979
_version_ 1866915891777634304
author Zhou, Yuhang
Zhao, Zhuokai
Li, Ke
Evmorfos, Spilios
Demirci, Gökalp
Wang, Mingyi
Liu, Qiao
Wang, Qifei
Li, Serena
Li, Weiwei
Wang, Tingting
Gao, Mingze
Zhou, Gedi
Kumar, Abhishek
Fan, Xiangjun
Zhang, Lizhu
Liu, Jiayi
contents Feature selection is a crucial step in large-scale industrial machine learning systems, directly affecting model accuracy, efficiency, and maintainability. Traditional feature selection methods rely on labeled data and statistical heuristics, making them difficult to apply in production environments where labeled data are limited and multiple operational constraints must be satisfied. To address this, we propose Model Feature Agent (MoFA), a model-driven framework that performs sequential, reasoning-based feature selection using both semantic and quantitative feature information. MoFA incorporates feature definitions, importance scores, correlations, and metadata (e.g., feature groups or types) into structured prompts and selects features through interpretable, constraint-aware reasoning. We evaluate MoFA in three real-world industrial applications: (1) True Interest and Time-Worthiness Prediction, where it improves accuracy while reducing feature group complexity, (2) Value Model Enhancement, where it discovers high-order interaction terms that yield substantial engagement gains in online experiments, and (3) Notification Behavior Prediction, where it selects compact, high-value feature subsets that improve both model accuracy and inference efficiency. Together, these results demonstrate the practicality and effectiveness of LLM-based reasoning for feature selection in real production systems.
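The abstract describes folding feature definitions, importance scores, correlations, and group metadata into a structured prompt, then selecting features under operational constraints. The record does not reproduce the prompt template or the constraint set, so the following is only a minimal sketch of that idea. Every name in it (`Feature`, `build_prompt`, `satisfies_constraints`, the example features, and the "at most N features from at most M groups" constraint) is a hypothetical placeholder, not the authors' implementation. No model call is made; the sketch only assembles the prompt text and checks a candidate answer against the constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One candidate feature; all fields are illustrative assumptions."""
    name: str
    definition: str    # semantic description of the feature
    importance: float  # quantitative score from some upstream ranking
    group: str         # metadata: the feature group it belongs to


def build_prompt(features, correlations, max_features, max_groups):
    """Assemble a plain-text prompt listing features, correlations, and constraints."""
    lines = [
        "Task: select features for the model under the constraints below.",
        f"Constraints: at most {max_features} features from at most {max_groups} feature groups.",
        "Candidate features:",
    ]
    # List the highest-importance features first.
    for f in sorted(features, key=lambda f: f.importance, reverse=True):
        lines.append(
            f"- {f.name} | group={f.group} | importance={f.importance:.3f} | {f.definition}"
        )
    lines.append("Pairwise correlations:")
    for (a, b), r in sorted(correlations.items()):
        lines.append(f"- {a} ~ {b}: {r:+.2f}")
    lines.append("Answer with one feature name per line, each followed by a one-sentence reason.")
    return "\n".join(lines)


def satisfies_constraints(selected, features, max_features, max_groups):
    """Return True if the selection is known, duplicate-free, and within both limits."""
    by_name = {f.name: f for f in features}
    if any(name not in by_name for name in selected):
        return False
    if len(set(selected)) != len(selected):
        return False
    groups = {by_name[name].group for name in selected}
    return len(selected) <= max_features and len(groups) <= max_groups


# Hypothetical example data.
FEATURES = [
    Feature("clicks_7d", "Clicks in the last 7 days", 0.42, "engagement"),
    Feature("clicks_30d", "Clicks in the last 30 days", 0.40, "engagement"),
    Feature("account_age_days", "Days since account creation", 0.15, "profile"),
]
CORRELATIONS = {("clicks_30d", "clicks_7d"): 0.91}

prompt = build_prompt(FEATURES, CORRELATIONS, max_features=2, max_groups=2)
```

Checking the returned selection in code, rather than trusting the model to respect the limits, keeps the constraint handling deterministic; a selection that fails the check could be sent back in a follow-up prompt.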
format Preprint
id arxiv_https___arxiv_org_abs_2603_24979
institution arXiv
publishDate 2026
record_format arxiv
title LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems
topic Computation and Language
url https://arxiv.org/abs/2603.24979