Staff View: GraphFM: A Comprehensive Benchmark for Graph Foundation Model

Bibliographic Details
Main Authors:	Xu, Yuhao; Liu, Xinqi; Duan, Keyu; Fang, Yi; Chuang, Yu-Neng; Zha, Daochen; Tan, Qiaoyu
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2406.08310

_version_	1866911917845512192
author	Xu, Yuhao; Liu, Xinqi; Duan, Keyu; Fang, Yi; Chuang, Yu-Neng; Zha, Daochen; Tan, Qiaoyu
author_facet	Xu, Yuhao; Liu, Xinqi; Duan, Keyu; Fang, Yi; Chuang, Yu-Neng; Zha, Daochen; Tan, Qiaoyu
contents	Foundation Models (FMs) serve as a general class for the development of artificial intelligence systems, offering broad potential for generalization across a spectrum of downstream tasks. Despite extensive research into self-supervised learning as the cornerstone of FMs, several outstanding issues persist in Graph Foundation Models that rely on graph self-supervised learning, namely: 1) Homogenization. The extent of generalization capability on downstream tasks remains unclear. 2) Scalability. It is unknown how effectively these models can scale to large datasets. 3) Efficiency. The training time and memory usage of these models require evaluation. 4) Training Stop Criteria. It is unclear how to determine the optimal stopping strategy for pre-training across multiple tasks so as to maximize performance on downstream tasks. To address these questions, we have constructed a rigorous benchmark that thoroughly analyzes and studies the generalization and scalability of self-supervised Graph Neural Network (GNN) models. Regarding generalization, we have implemented and compared the performance of various self-supervised GNN models, trained to generate node representations, across tasks such as node classification, link prediction, and node clustering. For scalability, we have compared the performance of various models after training using full-batch and mini-batch strategies. Additionally, we have assessed the training efficiency of these models by conducting experiments to test their GPU memory usage and throughput. Through these experiments, we aim to provide insights to motivate future research. The code for this benchmark is publicly available at https://github.com/NYUSHCS/GraphFM.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_08310
institution	arXiv
publishDate	2024
record_format	arxiv
title	GraphFM: A Comprehensive Benchmark for Graph Foundation Model
topic	Machine Learning
url	https://arxiv.org/abs/2406.08310

Similar Items