Bibliographic Details
Main Authors: Chi, Hongliang, Qi, Cong, Wang, Suhang, Ma, Yao
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2402.02321
author Chi, Hongliang
Qi, Cong
Wang, Suhang
Ma, Yao
contents Graph Neural Networks (GNNs) have seen significant success in tasks such as node classification, but that success largely depends on the availability of sufficient labeled nodes. Because labeling large-scale graphs is expensive, attention has turned to active learning on graphs, which aims to select data effectively so as to maximize downstream model performance. Most existing methods assume a reliable graph topology, whereas real-world graphs are often noisy. Designing an active learning framework for noisy graphs is therefore needed but challenging, because selecting data for labeling and obtaining a clean graph are interdependent tasks: selecting high-quality data requires a clean graph structure, while cleaning a noisy graph structure requires sufficient labeled data. To address this interdependence, we propose an active learning framework, GALClean, which iteratively performs data selection and graph purification together, each using the best information learned in the prior iteration. We also frame GALClean as an instance of the Expectation-Maximization algorithm, which provides a theoretical understanding of its design and mechanisms. This theory naturally leads to an enhanced version, GALClean+. Extensive experiments demonstrate the effectiveness and robustness of the proposed method across various types and levels of graph noise.
format Preprint
id arxiv_https___arxiv_org_abs_2402_02321
institution arXiv
publishDate 2024
record_format arxiv
title Active Learning for Graphs with Noisy Structures
topic Machine Learning
url https://arxiv.org/abs/2402.02321