Bibliographic Details
Main Authors: Zhang, Yanan; Bai, Xiaoling; Zhou, Tianhua
Format: Preprint
Published: 2024
Subjects: Computation and Language; Information Retrieval
Online Access: https://arxiv.org/abs/2404.05989
author Zhang, Yanan
Bai, Xiaoling
Zhou, Tianhua
contents Embedding-based retrieval (EBR) is widely used in mainstream search engine retrieval systems and is crucial to recent retrieval-augmented methods for mitigating LLM hallucinations. However, existing EBR models often suffer from "semantic drift" and pay insufficient attention to key information, so downstream steps adopt few of the retrieved results. The issue is especially noticeable in real-time search, where popular events are expressed in many different ways on the Internet and retrieval therefore relies heavily on crucial event information. To tackle this problem, this paper proposes EER, an approach that enhances real-time retrieval performance by improving the dual-encoder model of traditional EBR. We combine contrastive learning with pairwise learning for encoder optimization. Furthermore, to strengthen the focus on critical event information, we add a decoder module after the document encoder, introduce a generative event triplet extraction scheme based on prompt-tuning, and tie the extracted events to query encoder optimization through contrastive learning. The decoder module can be removed during inference. Extensive experiments demonstrate that EER significantly improves real-time search retrieval performance. We believe this approach will provide new perspectives in the field of information retrieval. The code and dataset are available at https://github.com/open-event-hub/Event-enhanced_Retrieval .
format Preprint
id arxiv_https___arxiv_org_abs_2404_05989
institution arXiv
publishDate 2024
record_format arxiv
title Event-enhanced Retrieval in Real-time Search
topic Computation and Language
Information Retrieval
url https://arxiv.org/abs/2404.05989
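
Illustrative note: the abstract describes optimizing a dual encoder with contrastive learning alongside pairwise learning. The sketch below shows one common way such objectives are combined; it is not taken from the paper or its repository. It assumes an in-batch InfoNCE contrastive loss, a margin-based pairwise loss, cosine similarity, and plain Python lists standing in for encoder outputs. All function names, the temperature, the margin, and the weighting are hypothetical choices made for illustration.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _unit(v):
    # L2-normalize so that dot products become cosine similarities.
    # Assumes v is not the zero vector.
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def contrastive_loss(queries, docs, temperature=0.05):
    """In-batch InfoNCE (assumed form): docs[i] is the positive for
    queries[i]; every other document in the batch acts as a negative."""
    q = [_unit(x) for x in queries]
    d = [_unit(x) for x in docs]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [_dot(qi, dj) / temperature for dj in d]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[i]  # negative log-softmax of the positive
    return total / len(q)


def pairwise_loss(query, pos_doc, neg_doc, margin=0.2):
    """Margin ranking loss (assumed form): the positive document should
    score at least `margin` higher than the negative one."""
    q, p, n = _unit(query), _unit(pos_doc), _unit(neg_doc)
    return max(0.0, margin - _dot(q, p) + _dot(q, n))


def combined_loss(queries, pos_docs, neg_docs, weight=1.0):
    """Contrastive loss plus a weighted mean pairwise loss.
    The additive weighting is an assumption, not the paper's formula."""
    pair = sum(
        pairwise_loss(q, p, n)
        for q, p, n in zip(queries, pos_docs, neg_docs)
    ) / len(queries)
    return contrastive_loss(queries, pos_docs) + weight * pair


if __name__ == "__main__":
    # Toy 2-d "embeddings": each query matches the document at the same index.
    queries = [[1.0, 0.0], [0.0, 1.0]]
    pos_docs = [[0.9, 0.1], [0.1, 0.9]]
    neg_docs = [[0.0, 1.0], [1.0, 0.0]]
    print(combined_loss(queries, pos_docs, neg_docs))
```

In a real system the vectors would come from trained query and document encoders and the losses would be computed with an autodiff framework; the event-triplet decoder mentioned in the abstract is not modeled here.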