Staff View: A Benchmark Dataset for Graph Regression with Homogeneous and Multi-Relational Variants :: Library Catalog

Bibliographic Details
Main Authors:	Samoaa, Peter; Vukojevic, Marcus; Chehreghani, Morteza Haghir; Longa, Antonio
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2505.23875

_version_	1866917451634049024
author	Samoaa, Peter; Vukojevic, Marcus; Chehreghani, Morteza Haghir; Longa, Antonio
author_facet	Samoaa, Peter; Vukojevic, Marcus; Chehreghani, Morteza Haghir; Longa, Antonio
contents	Graph-level regression underpins many real-world applications, yet public benchmarks remain heavily skewed toward molecular graphs and citation networks. This limited diversity hinders progress on models that must generalize across both homogeneous and heterogeneous graph structures. We introduce RelSC, a new graph-regression dataset built from program graphs that combine syntactic and semantic information extracted from source code. Each graph is labelled with the execution-time cost of the corresponding program, providing a continuous target variable that differs markedly from those found in existing benchmarks. RelSC is released in two complementary variants. RelSC-H supplies rich node features under a single (homogeneous) edge type, while RelSC-M preserves the original multi-relational structure, connecting nodes through multiple edge types that encode distinct semantic relationships. Together, these variants let researchers probe how representation choice influences model behaviour. We evaluate a diverse set of graph neural network architectures on both variants of RelSC. The results reveal consistent performance differences between the homogeneous and multi-relational settings, emphasising the importance of structural representation. These findings demonstrate RelSC's value as a challenging and versatile benchmark for advancing graph regression methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_23875
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Benchmark Dataset for Graph Regression with Homogeneous and Multi-Relational Variants
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2505.23875
