Bibliographic Details
Main Authors: Jiang, Xiaobo, Deng, Yadong
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2505.01655
author Jiang, Xiaobo
Deng, Yadong
author_facet Jiang, Xiaobo
Deng, Yadong
contents FB15k-237 mitigates data leakage by excluding inverse and symmetric relationship triples; however, this has led to substantial performance degradation and slow subsequent progress. Traditional approaches show limited effectiveness on FB15k-237, primarily because the mechanism by which the dataset's structural features influence model performance remains unexplored. To bridge this gap, we systematically investigate how dataset structural features affect link prediction performance. First, we design a structured subgraph sampling strategy that ensures connectivity while constructing subgraphs with distinct structural features. Then, through correlation and sensitivity analyses conducted across several mainstream models, we observe that the distribution of relationship categories within subgraphs affects performance most strongly, followed by the size of strongly connected components. Further exploration using LIME clarifies how relationship categories influence link prediction performance: they primarily modulate the relative importance of entity embeddings and relationship embeddings, thereby affecting link prediction outcomes. These findings provide theoretical insights for addressing performance bottlenecks on FB15k-237, and the proposed analytical framework offers methodological guidance for future studies of structurally constrained datasets.
format Preprint
id arxiv_https___arxiv_org_abs_2505_01655
institution arXiv
publishDate 2025
record_format arxiv
title Understanding the Mechanisms Behind Structural Influences on Link Prediction: A Case Study on FB15k-237
topic Signal Processing
url https://arxiv.org/abs/2505.01655