Staff View: An Enhanced Batch Query Architecture in Real-time Recommendation :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Qiang; Teng, Zhipeng; Wu, Disheng; Wang, Jiayin
Format:	Preprint
Published:	2024
Subjects:	Information Retrieval; Machine Learning; C.3; H.3.3
Online Access:	https://arxiv.org/abs/2409.00400

_version_	1866910684748447744
author	Zhang, Qiang; Teng, Zhipeng; Wu, Disheng; Wang, Jiayin
author_facet	Zhang, Qiang; Teng, Zhipeng; Wu, Disheng; Wang, Jiayin
contents	In industrial recommendation systems on websites and apps, it is essential to recall and predict the top-n results relevant to user interests from a content pool of billions within milliseconds. To cope with continuous data growth and improve real-time recommendation performance, we have designed and implemented a high-performance batch query architecture for real-time recommendation systems. Our contributions include a cacheline-aware probing method that enhances coalesced hashing, and a hybrid storage key-value service built upon it. Our experiments indicate that this approach significantly surpasses conventional hash tables in batch query throughput, achieving up to 90% of the query throughput of random memory access when parallel optimization is incorporated. Support for NVMe, integrating two-tier storage for hot and cold data, notably reduces resource consumption. Additionally, the system supports dynamic updates and automated sharding of attributes and feature embedding tables, and introduces protocols for consistency in batch queries, thereby enhancing the effectiveness of real-time incremental learning updates. This architecture has been deployed for over a year in the recommendation system of bilibili, a video content community with hundreds of millions of users, supporting a 10x increase in model computation with minimal resource growth and improving outcomes while preserving the system's real-time performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_00400
institution	arXiv
publishDate	2024
record_format	arxiv
title	An Enhanced Batch Query Architecture in Real-time Recommendation
topic	Information Retrieval; Machine Learning; C.3; H.3.3
url	https://arxiv.org/abs/2409.00400

Similar Items