Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Gao, Zhangyang; Dong, Daize; Tan, Cheng; Xia, Jun; Hu, Bozhen; Li, Stan Z.
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Social and Information Networks
Online Access:	https://arxiv.org/abs/2402.02464

_version_	1866916264437350400
author	Gao, Zhangyang; Dong, Daize; Tan, Cheng; Xia, Jun; Hu, Bozhen; Li, Stan Z.
author_facet	Gao, Zhangyang; Dong, Daize; Tan, Cheng; Xia, Jun; Hu, Bozhen; Li, Stan Z.
contents	Can we model non-Euclidean graphs as pure language, or even as Euclidean vectors, while retaining their inherent information? The non-Euclidean property has posed a long-term challenge in graph modeling. Although recent graph neural networks and graph transformers encode graphs as Euclidean vectors, recovering the original graph from those vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring a Graph2Seq encoder that transforms non-Euclidean graphs into learnable Graph Words in Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from the Graph Words to ensure information equivalence. We pretrain GraphsGPT on 100M molecules and report several findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on 8 of 9 graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, as demonstrated by its ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in Euclidean space, overcoming previously known non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph domain tasks, excelling in both representation and generation. Code is available at https://github.com/A4Bio/GraphsGPT.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_02464
institution	arXiv
publishDate	2024
record_format	arxiv
title	A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
topic	Machine Learning; Artificial Intelligence; Social and Information Networks
url	https://arxiv.org/abs/2402.02464
