Bibliographic Details
Main Authors: Chen, Lin; Xu, Fengli; Li, Nian; Han, Zhenyu; Wang, Meng; Li, Yong; Hui, Pan
Format: Preprint
Published: 2024
Subjects: Machine Learning; Computation and Language
Online Access: https://arxiv.org/abs/2402.11518
author Chen, Lin
Xu, Fengli
Li, Nian
Han, Zhenyu
Wang, Meng
Li, Yong
Hui, Pan
contents Heterogeneous information networks (HINs) have gained increasing popularity in recent years for capturing complex relations between diverse types of nodes. Meta-structures have been proposed as a useful tool for identifying important patterns in HINs, but hand-crafted meta-structures are difficult to scale, which has drawn wide research attention to automatic search algorithms. Previous efforts primarily focused on searching for meta-structures with good empirical performance, overlooking the importance of human comprehensibility and generalizability. To address this challenge, we draw inspiration from the emergent reasoning abilities of large language models (LLMs). We propose ReStruct, a meta-structure search framework that integrates LLM reasoning into the evolutionary procedure. ReStruct uses a grammar translator to encode the meta-structures into natural language sentences, and leverages the reasoning power of LLMs to evaluate their semantic feasibility. ReStruct also employs performance-oriented evolutionary operations. These two competing forces allow ReStruct to jointly optimize the semantic explainability and empirical performance of meta-structures. Furthermore, ReStruct contains a differential LLM explainer to generate and refine natural language explanations for the discovered meta-structures by reasoning through the search history. Experiments on eight representative HIN datasets demonstrate that ReStruct achieves state-of-the-art performance in both recommendation and node classification tasks. Moreover, a survey study involving 73 graduate students shows that the meta-structures discovered and the explanations generated by ReStruct are substantially more comprehensible. Our code and questionnaire are available at https://github.com/LinChen-65/ReStruct.
format Preprint
id arxiv_https___arxiv_org_abs_2402_11518
institution arXiv
publishDate 2024
record_format arxiv
title Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network
topic Machine Learning
Computation and Language
url https://arxiv.org/abs/2402.11518