Bibliographic Details
Main Authors: Loreti, Andrea; Chen, Kesi; George, Ruby; Firth, Robert; Agnello, Adriano; Tanaka, Shinnosuke
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2504.07738
_version_ 1866918501307908096
author Loreti, Andrea
Chen, Kesi
George, Ruby
Firth, Robert
Agnello, Adriano
Tanaka, Shinnosuke
contents In this document, we discuss a multi-step approach to automated construction of a knowledge graph, for structuring and representing domain-specific knowledge from large document corpora. We apply our method to build the first knowledge graph of nuclear fusion energy, a highly specialized field characterized by vast scope and heterogeneity. This is an ideal benchmark to test the key features of our pipeline, including automatic named entity recognition and entity resolution. We show how pre-trained large language models can be used to address these challenges and we evaluate their performance against Zipf's law, which characterizes human natural language. Additionally, we develop a knowledge-graph retrieval-augmented generation system that uses multiple prompts with large language models to provide contextually relevant answers to natural-language queries, including complex multi-hop questions requiring reasoning across interconnected entities.
format Preprint
id arxiv_https___arxiv_org_abs_2504_07738
institution arXiv
publishDate 2025
record_format arxiv
title Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information
topic Computation and Language
url https://arxiv.org/abs/2504.07738