MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Li, Bingxuan; Shi, Pengyi; Ward, Amy
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2410.04347

_version_	1866912060665757696
author	Li, Bingxuan; Shi, Pengyi; Ward, Amy
author_facet	Li, Bingxuan; Shi, Pengyi; Ward, Amy
contents	Predictive modeling often faces challenges due to limited data availability and quality, especially in domains where collected features are weakly correlated with outcomes and where additional feature collection is constrained by ethical or practical difficulties. Traditional machine learning (ML) models struggle to incorporate unobserved yet critical factors. In this work, we introduce an effective approach to formulate latent feature mining as text-to-text propositional logical reasoning. We propose FLAME (Faithful Latent Feature Mining for Predictive Model Enhancement), a framework that leverages large language models (LLMs) to augment observed features with latent features and enhance the predictive power of ML models in downstream tasks. Our framework is generalizable across various domains with necessary domain-specific adaptation, as it is designed to incorporate contextual information unique to each area, ensuring effective transfer to different areas facing similar data availability challenges. We validate our framework with two case studies: (1) the criminal justice system, a domain characterized by limited and ethically challenging data collection; (2) the healthcare domain, where patient privacy concerns and the complexity of medical data limit comprehensive feature collection. Our results show that inferred latent features align well with ground truth labels and significantly enhance the downstream classifier.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_04347
institution	arXiv
publishDate	2024
record_format	arxiv
title	Latent Feature Mining for Predictive Model Enhancement with Large Language Models
topic	Machine Learning; Computation and Language
url	https://arxiv.org/abs/2410.04347
