Staff View: MTGenRec: An Efficient Distributed Training System for Generative Recommendation Models in Meituan :: Library Catalog

Bibliographic Details
Main Authors:	Wang, Yuxiang; Yan, Xiao; Ma, Chi; Huang, Mincong; Li, Xiaoguang; Yu, Lei; Liu, Chuan; Han, Ruidong; Jiang, He; Yin, Bin; Chen, Shangyu; Jiang, Fei; Li, Xiang; Lin, Wei; Han, Haowei; Du, Bo; Jiang, Jiawei
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2505.12663

_version_	1866912725707259904
author	Wang, Yuxiang; Yan, Xiao; Ma, Chi; Huang, Mincong; Li, Xiaoguang; Yu, Lei; Liu, Chuan; Han, Ruidong; Jiang, He; Yin, Bin; Chen, Shangyu; Jiang, Fei; Li, Xiang; Lin, Wei; Han, Haowei; Du, Bo; Jiang, Jiawei
author_facet	Wang, Yuxiang; Yan, Xiao; Ma, Chi; Huang, Mincong; Li, Xiaoguang; Yu, Lei; Liu, Chuan; Han, Ruidong; Jiang, He; Yin, Bin; Chen, Shangyu; Jiang, Fei; Li, Xiang; Lin, Wei; Han, Haowei; Du, Bo; Jiang, Jiawei
contents	Recommendation is crucial to both user experience and company revenue at Meituan, a leading lifestyle company, and generative recommendation models (GRMs) have recently been shown to produce high-quality recommendations. However, existing systems offer insufficient functionality and inefficient implementations for training GRMs in industrial scenarios. We therefore introduce MTGenRec, an efficient and scalable system for GRM training. Specifically, to handle real-time insertions and deletions of sparse embeddings, MTGenRec employs dynamic hash tables in place of static ones. To improve training efficiency, MTGenRec applies dynamic sequence balancing to address computation load imbalance among GPUs, and adopts feature ID deduplication together with automatic table merging to accelerate embedding lookup. Extensive experiments show that MTGenRec improves training throughput by $1.6\times$ to $2.4\times$ while scaling well when running on over 100 GPUs. MTGenRec has been deployed for many applications at Meituan and now handles hundreds of millions of requests daily. On the delivery platform, we observe a 1.22% increase in user order volume and a 1.31% improvement in online PV_CTR.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_12663
institution	arXiv
publishDate	2025
record_format	arxiv
title	MTGenRec: An Efficient Distributed Training System for Generative Recommendation Models in Meituan
topic	Distributed, Parallel, and Cluster Computing
url	https://arxiv.org/abs/2505.12663
