Bibliographic Details
Main Authors: Potdar, Saloni; Lee, Daniel; Attia, Omar; Embar, Varun; Meng, De; Balaji, Ramesh; Seivwright, Chloe; Choi, Eric; Farid, Mina H.; Sun, Yiwen; Li, Yunyao
Format: Preprint
Published: 2025
Subjects: Computation and Language; Databases
Online Access: https://arxiv.org/abs/2501.17270
contents Question answering systems for knowledge graphs (KGQA) answer factoid questions based on the data in the knowledge graph (KG). KGQA systems are complex because they must understand the relations and entities in knowledge-seeking natural language queries and map them to structured queries against the KG. In this paper, we introduce Chronos, a comprehensive evaluation framework for KGQA at industry scale. It is designed to evaluate such a multi-component system comprehensively, focusing on (1) end-to-end and component-level metrics, (2) scalability to diverse datasets, and (3) a scalable approach to measuring the performance of the system prior to release. We discuss the unique challenges associated with evaluating KGQA systems at industry scale, review the design of Chronos, and describe how it addresses these challenges. We demonstrate how it provides a basis for data-driven decisions and discuss the challenges of using it to measure and improve a real-world KGQA system.
institution arXiv
title Comprehensive Evaluation for a Large Scale Knowledge Graph Question Answering Service
topic Computation and Language
Databases