Bibliographic Details
Main Authors: Zheng, Chujie; Wang, Jeffrey; Zhang, Shuqian Albee; Kishore, Anand; Singh, Siddharth
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2410.21549
Table of Contents:
  • We propose a novel method for evaluating a content search system by measuring the semantic match between a query and the results the system returns. We introduce a metric, the "on-topic rate", that measures the percentage of results relevant to the query. To compute it, we design a pipeline that defines a golden query set, retrieves the top K results for each query, and sends calls to GPT 3.5 with formulated prompts. The pipeline helps identify common failure patterns and set goals against the metric for relevance improvements.
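
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `on_topic_rate`, `retrieve`, and `is_on_topic` are hypothetical. The relevance judge is passed in as a plain callable, so a real GPT 3.5 call with a formulated prompt could be substituted; here a toy word-overlap judge and an in-memory index stand in for the model and the search system. Scoring a query with no results as 0.0 is also an assumption.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def on_topic_rate(
    golden_queries: Sequence[str],
    retrieve: Callable[[str, int], List[str]],
    is_on_topic: Callable[[str, str], bool],
    k: int = 5,
) -> Tuple[float, Dict[str, float]]:
    """Return (overall on-topic rate, per-query on-topic rate).

    retrieve(query, k) is assumed to return up to k result texts.
    is_on_topic(query, result) is assumed to wrap the relevance judge
    (an LLM call in the paper's setting; any callable works here).
    """
    judged = 0
    relevant = 0
    per_query: Dict[str, float] = {}
    for query in golden_queries:
        results = retrieve(query, k)[:k]  # keep only the top K results
        hits = sum(1 for result in results if is_on_topic(query, result))
        # Assumption: a query with no results is scored 0.0.
        per_query[query] = hits / len(results) if results else 0.0
        judged += len(results)
        relevant += hits
    overall = relevant / judged if judged else 0.0
    return overall, per_query


# Toy stand-ins (hypothetical data, for illustration only).
TOY_INDEX = {
    "python tutorial": [
        "Learn Python basics",
        "Python list tutorial",
        "Cooking pasta at home",
    ],
    "travel japan": ["Japan travel guide", "Best sushi in Tokyo"],
}


def toy_retrieve(query: str, k: int) -> List[str]:
    """Look the query up in the in-memory index."""
    return TOY_INDEX.get(query, [])[:k]


def toy_judge(query: str, result: str) -> bool:
    """Crude judge: on-topic if the query and result share a word."""
    return bool(set(query.lower().split()) & set(result.lower().split()))
```

Calling `on_topic_rate(list(TOY_INDEX), toy_retrieve, toy_judge, k=3)` yields an overall rate together with a per-query breakdown. The per-query rates are what make it possible to spot failure patterns such as a single query category dragging the overall metric down.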