Saved in:
Bibliographic Details
Main Authors: Neider, Daniel, Sabellek, Leif, Schmidt, Johannes, Vehlken, Fabian, Zeume, Thomas
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2410.07708
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866909498771243008
author Neider, Daniel
Sabellek, Leif
Schmidt, Johannes
Vehlken, Fabian
Zeume, Thomas
author_facet Neider, Daniel
Sabellek, Leif
Schmidt, Johannes
Vehlken, Fabian
Zeume, Thomas
contents Explaining why and how a tree $t$ structurally differs from another tree $t^\star$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^\star),\dots, (t_n, t_n^\star)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^\star)$? This raises two research questions: (i) what is a good notion of "rule" in this context?; and (ii) how can sets of rules explaining a data set be learned algorithmically? We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers.
format Preprint
id arxiv_https___arxiv_org_abs_2410_07708
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Learning Tree Pattern Transformations
Neider, Daniel
Sabellek, Leif
Schmidt, Johannes
Vehlken, Fabian
Zeume, Thomas
Machine Learning
Artificial Intelligence
Computational Complexity
Databases
Explaining why and how a tree $t$ structurally differs from another tree $t^\star$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^\star),\dots, (t_n, t_n^\star)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^\star)$? This raises two research questions: (i) what is a good notion of "rule" in this context?; and (ii) how can sets of rules explaining a data set be learned algorithmically? We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers.
title Learning Tree Pattern Transformations
topic Machine Learning
Artificial Intelligence
Computational Complexity
Databases
url https://arxiv.org/abs/2410.07708