Bibliographic Details
Main Authors: Wang, Yuxiang; Yan, Xiao; Ma, Chi; Huang, Mincong; Li, Xiaoguang; Yu, Lei; Liu, Chuan; Han, Ruidong; Jiang, He; Yin, Bin; Chen, Shangyu; Jiang, Fei; Li, Xiang; Lin, Wei; Han, Haowei; Du, Bo; Jiang, Jiawei
Format: Preprint
Published: 2025
Subjects: Distributed, Parallel, and Cluster Computing
Online Access: https://arxiv.org/abs/2505.12663
author Wang, Yuxiang
Yan, Xiao
Ma, Chi
Huang, Mincong
Li, Xiaoguang
Yu, Lei
Liu, Chuan
Han, Ruidong
Jiang, He
Yin, Bin
Chen, Shangyu
Jiang, Fei
Li, Xiang
Lin, Wei
Han, Haowei
Du, Bo
Jiang, Jiawei
contents Recommendation is crucial for both user experience and company revenue at Meituan, a leading lifestyle company, and generative recommendation models (GRMs) have recently been shown to produce high-quality recommendations. However, existing systems are limited by insufficient functionality support and inefficient implementations for training GRMs in industrial scenarios. We therefore introduce MTGenRec, an efficient and scalable system for GRM training. Specifically, to handle real-time insertions and deletions of sparse embeddings, MTGenRec replaces static tables with dynamic hash tables. To improve training efficiency, MTGenRec applies dynamic sequence balancing to address computation load imbalance among GPUs, and adopts feature ID deduplication alongside automatic table merging to accelerate embedding lookup. Extensive experiments show that MTGenRec improves training throughput by $1.6\times$ to $2.4\times$ while scaling well when running on more than 100 GPUs. MTGenRec has been deployed for many applications at Meituan and now handles hundreds of millions of requests daily. On the delivery platform, we observe a 1.22% growth in user order volume and a 1.31% improvement in online PV_CTR.
format Preprint
id arxiv_https___arxiv_org_abs_2505_12663
institution arXiv
publishDate 2025
record_format arxiv
title MTGenRec: An Efficient Distributed Training System for Generative Recommendation Models in Meituan
topic Distributed, Parallel, and Cluster Computing
url https://arxiv.org/abs/2505.12663