Saved in:
| Main Authors: | , , , , |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.07708 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866909498771243008 |
|---|---|
| author | Neider, Daniel Sabellek, Leif Schmidt, Johannes Vehlken, Fabian Zeume, Thomas |
| author_facet | Neider, Daniel Sabellek, Leif Schmidt, Johannes Vehlken, Fabian Zeume, Thomas |
| contents | Explaining why and how a tree $t$ structurally differs from another tree $t^\star$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^\star),\dots, (t_n, t_n^\star)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^\star)$? This raises two research questions: (i) what is a good notion of "rule" in this context?; and (ii) how can sets of rules explaining a data set be learned algorithmically?
We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2410_07708 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | Learning Tree Pattern Transformations Neider, Daniel Sabellek, Leif Schmidt, Johannes Vehlken, Fabian Zeume, Thomas Machine Learning Artificial Intelligence Computational Complexity Databases Explaining why and how a tree $t$ structurally differs from another tree $t^\star$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^\star),\dots, (t_n, t_n^\star)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^\star)$? This raises two research questions: (i) what is a good notion of "rule" in this context?; and (ii) how can sets of rules explaining a data set be learned algorithmically? We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers. |
| title | Learning Tree Pattern Transformations |
| topic | Machine Learning Artificial Intelligence Computational Complexity Databases |
| url | https://arxiv.org/abs/2410.07708 |