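Staff View: What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA :: Library Catalog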

Bibliographic Details
Main Authors:	Petersen, Jonas; Mazzoleni, Camilla; Lombardi, Gian-Alessandro; Martelli, Federico; Maggioni, Riccardo
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.02834

_version_	1866911668568588288
author	Petersen, Jonas; Mazzoleni, Camilla; Lombardi, Gian-Alessandro; Martelli, Federico; Maggioni, Riccardo
author_facet	Petersen, Jonas; Mazzoleni, Camilla; Lombardi, Gian-Alessandro; Martelli, Federico; Maggioni, Riccardo
contents	What structural inductive bias helps transformers reason over knowledge graphs? Through controlled ablations of a minimal transformer modification with four independently removable components (sparse adjacency masking, edge-type biases, query scaling, value gating), we isolate which structural signals drive multi-hop reasoning. Our finding is sharp: sparse adjacency masking alone accounts for the dominant share of improvement over unmasked transformers (+72.5pp on 3-hop MetaQA, +45.5pp on WebQSP, +53.9pp on CWQ), while learned relation parameters add only modest refinement and can actively hurt without structural guidance. A zero-shot experiment provides architecturally independent corroboration: masking-based attention degrades 4.0x less than relation-specific weights when edge types are held out. The useful inductive bias for multi-hop KGQA is predominantly topological, not relational.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_02834
institution	arXiv
publishDate	2026
record_format	arxiv
title	What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2602.02834

Similar Items