Staff View: LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems :: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Yuhang; Zhao, Zhuokai; Li, Ke; Evmorfos, Spilios; Demirci, Gökalp; Wang, Mingyi; Liu, Qiao; Wang, Qifei; Li, Serena; Li, Weiwei; Wang, Tingting; Gao, Mingze; Zhou, Gedi; Kumar, Abhishek; Fan, Xiangjun; Zhang, Lizhu; Liu, Jiayi
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.24979

_version_	1866915891777634304
author	Zhou, Yuhang; Zhao, Zhuokai; Li, Ke; Evmorfos, Spilios; Demirci, Gökalp; Wang, Mingyi; Liu, Qiao; Wang, Qifei; Li, Serena; Li, Weiwei; Wang, Tingting; Gao, Mingze; Zhou, Gedi; Kumar, Abhishek; Fan, Xiangjun; Zhang, Lizhu; Liu, Jiayi
author_facet	Zhou, Yuhang; Zhao, Zhuokai; Li, Ke; Evmorfos, Spilios; Demirci, Gökalp; Wang, Mingyi; Liu, Qiao; Wang, Qifei; Li, Serena; Li, Weiwei; Wang, Tingting; Gao, Mingze; Zhou, Gedi; Kumar, Abhishek; Fan, Xiangjun; Zhang, Lizhu; Liu, Jiayi
contents	Feature selection is a crucial step in large-scale industrial machine learning systems, directly affecting model accuracy, efficiency, and maintainability. Traditional feature selection methods rely on labeled data and statistical heuristics, making them difficult to apply in production environments where labeled data are limited and multiple operational constraints must be satisfied. To address this, we propose Model Feature Agent (MoFA), a model-driven framework that performs sequential, reasoning-based feature selection using both semantic and quantitative feature information. MoFA incorporates feature definitions, importance scores, correlations, and metadata (e.g., feature groups or types) into structured prompts and selects features through interpretable, constraint-aware reasoning. We evaluate MoFA in three real-world industrial applications: (1) True Interest and Time-Worthiness Prediction, where it improves accuracy while reducing feature group complexity, (2) Value Model Enhancement, where it discovers high-order interaction terms that yield substantial engagement gains in online experiments, and (3) Notification Behavior Prediction, where it selects compact, high-value feature subsets that improve both model accuracy and inference efficiency. Together, these results demonstrate the practicality and effectiveness of LLM-based reasoning for feature selection in real production systems.
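The abstract describes packing feature definitions, importance scores, correlations, and group metadata into a structured prompt, then selecting features under operational constraints. The following is a minimal illustrative sketch of that idea, not the authors' implementation: the `Feature` record, the `build_prompt` and `satisfies_constraints` helpers, the prompt wording, and the example features are all hypothetical, and no model call is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature record; field names are assumptions."""
    name: str
    definition: str
    importance: float
    group: str


def build_prompt(features, correlations, max_features, max_groups):
    """Render semantic and quantitative feature information as one prompt string."""
    lines = ["Candidate features:"]
    # List the most important features first.
    for f in sorted(features, key=lambda f: -f.importance):
        lines.append(
            f"- {f.name} (group={f.group}, importance={f.importance:.2f}): {f.definition}"
        )
    lines.append("Pairwise correlations:")
    for (a, b), r in sorted(correlations.items()):
        lines.append(f"- {a} ~ {b}: {r:+.2f}")
    lines.append(
        f"Constraints: select at most {max_features} features "
        f"from at most {max_groups} groups."
    )
    lines.append("Explain the reasoning, then list the selected feature names.")
    return "\n".join(lines)


def satisfies_constraints(selected, features, max_features, max_groups):
    """Check a proposed selection against the size and group-count limits."""
    by_name = {f.name: f for f in features}
    if any(name not in by_name for name in selected):
        return False  # unknown feature name
    if len(set(selected)) != len(selected):
        return False  # duplicate selection
    groups = {by_name[name].group for name in selected}
    return len(selected) <= max_features and len(groups) <= max_groups


# Made-up example data, for illustration only.
FEATURES = [
    Feature("clicks_7d", "Clicks in the last 7 days", 0.42, "engagement"),
    Feature("clicks_28d", "Clicks in the last 28 days", 0.40, "engagement"),
    Feature("account_age_days", "Days since account creation", 0.15, "profile"),
]
CORRELATIONS = {("clicks_28d", "clicks_7d"): 0.91}

if __name__ == "__main__":
    print(build_prompt(FEATURES, CORRELATIONS, max_features=2, max_groups=2))
```

In a sketch like this, whatever selection a model returns would be passed through `satisfies_constraints` before use, so that a response violating the stated limits can be rejected or retried.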
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_24979
institution	arXiv
publishDate	2026
record_format	arxiv
title	LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems
topic	Computation and Language
url	https://arxiv.org/abs/2603.24979

Similar Items