Bibliographic Details
Main Authors: Zhang, Qiang; Teng, Zhipeng; Wu, Disheng; Wang, Jiayin
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2409.00400
_version_ 1866910684748447744
author Zhang, Qiang
Teng, Zhipeng
Wu, Disheng
Wang, Jiayin
author_facet Zhang, Qiang
Teng, Zhipeng
Wu, Disheng
Wang, Jiayin
contents In industrial recommendation systems on websites and apps, it is essential to recall and predict top-n results relevant to user interests from a content pool of billions within milliseconds. To cope with continuous data growth and improve real-time recommendation performance, we have designed and implemented a high-performance batch query architecture for real-time recommendation systems. Our contributions include optimizing hash structures with a cacheline-aware probing method to enhance coalesced hashing, as well as the implementation of a hybrid storage key-value service built upon it. Our experiments indicate that this approach significantly surpasses conventional hash tables in batch query throughput, achieving up to 90% of the query throughput of random memory access when incorporating parallel optimization. Support for NVMe, which integrates two-tier storage for hot and cold data, notably reduces resource consumption. Additionally, the system facilitates dynamic updates and automated sharding of attributes and feature embedding tables, and it introduces new protocols for consistency in batch queries, thereby enhancing the effectiveness of real-time incremental learning updates. This architecture has been deployed for over a year in the recommendation system of bilibili, a video content community with hundreds of millions of users, where it supports a 10x increase in model computation with minimal resource growth, improving outcomes while preserving the system's real-time performance.
format Preprint
id arxiv_https___arxiv_org_abs_2409_00400
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle An Enhanced Batch Query Architecture in Real-time Recommendation
Zhang, Qiang
Teng, Zhipeng
Wu, Disheng
Wang, Jiayin
Information Retrieval
Machine Learning
C.3, H.3.3
title An Enhanced Batch Query Architecture in Real-time Recommendation
topic Information Retrieval
Machine Learning
C.3, H.3.3
url https://arxiv.org/abs/2409.00400
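
Illustrative note: the abstract above mentions a cacheline-aware probing method and batch queries but gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' method. It swaps in plain open addressing with linear probing over cache-line-sized groups of slots rather than the coalesced hashing the abstract describes. The class name `LineBucketTable`, the constant `SLOTS_PER_LINE`, and the assumption of a 64-byte cache line holding eight 8-byte slots are all hypothetical, and Python lists only model the layout conceptually.

```python
from typing import Iterable, Iterator, List, Optional

# Assumption: a 64-byte cache line holding eight 8-byte slots.
SLOTS_PER_LINE = 8


class LineBucketTable:
    """Open-addressing table that scans one cache-line-sized group of
    slots completely before probing the next group (hypothetical sketch)."""

    def __init__(self, num_lines: int) -> None:
        self.num_lines = num_lines
        size = num_lines * SLOTS_PER_LINE
        self.keys: List[Optional[int]] = [None] * size
        self.values: List[Optional[str]] = [None] * size

    def _probe(self, key: int) -> Iterator[int]:
        # Start at the group chosen by the hash, then visit groups in order.
        start_line = hash(key) % self.num_lines
        for i in range(self.num_lines):
            base = ((start_line + i) % self.num_lines) * SLOTS_PER_LINE
            # All slots of one group are visited before leaving it.
            for slot in range(base, base + SLOTS_PER_LINE):
                yield slot

    def put(self, key: int, value: str) -> None:
        for slot in self._probe(key):
            if self.keys[slot] is None or self.keys[slot] == key:
                self.keys[slot] = key
                self.values[slot] = value
                return
        raise RuntimeError("table is full")

    def get(self, key: int) -> Optional[str]:
        for slot in self._probe(key):
            if self.keys[slot] is None:
                # No deletions in this sketch, so an empty slot ends the search.
                return None
            if self.keys[slot] == key:
                return self.values[slot]
        return None

    def batch_get(self, keys: Iterable[int]) -> List[Optional[str]]:
        # A batch query answers many keys in one call; missing keys map to None.
        return [self.get(k) for k in keys]


table = LineBucketTable(num_lines=2)
for k in range(0, 18, 2):  # nine even keys, all hashing to group 0
    table.put(k, f"v{k}")
results = table.batch_get([0, 16, 3])
```

In this sketch the ninth colliding key spills into the next group only after the first group is full, so most lookups touch a single group of adjacent slots. A real implementation would use a fixed-width, cache-line-aligned array in a systems language to obtain the actual memory-access benefit.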