Bibliographic Details
Main Authors: Senaratne, Asara; Christen, Peter; Omran, Pouya; Williams, Graham
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2412.04780
author Senaratne, Asara
Christen, Peter
Omran, Pouya
Williams, Graham
contents Anomalies such as redundant, inconsistent, contradictory, and deficient values in a Knowledge Graph (KG) are unavoidable, as these graphs are often curated manually or extracted using machine learning and natural language processing techniques. Anomaly detection can therefore enhance the quality of KGs. In this paper, we propose SEKA (SEeking Knowledge graph Anomalies), an unsupervised approach for detecting abnormal triples and entities in KGs. SEKA can help improve the correctness of a KG whilst retaining its coverage. We also propose the Corroborative Path Rank Algorithm (CPRA), an efficient adaptation of the Path Rank Algorithm (PRA) that is customized to detect anomalies in KGs. Furthermore, we present TAXO (TAXOnomy of anomaly types in KGs), a taxonomy of the anomaly types that can occur in a KG. This taxonomy classifies the anomalies discovered by SEKA and is accompanied by an extensive discussion of possible data quality issues in a KG. We evaluate both approaches on four real-world KGs (YAGO-1, KBpedia, Wikidata, and DSKG) to demonstrate the ability of SEKA and TAXO to outperform the baselines.
format Preprint
id arxiv_https___arxiv_org_abs_2412_04780
institution arXiv
publishDate 2024
record_format arxiv
title Anomaly Detection and Classification in Knowledge Graphs
topic Machine Learning
url https://arxiv.org/abs/2412.04780