Staff View: Farewell to Item IDs: Unlocking the Scaling Potential of Large Ranking Models via Semantic Tokens :: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Zhen; Zhang, Tong; Xu, Jie; Cai, Qingliang; Zhang, Qile; Yang, Leyuan; Xiao, Daorui; Chang, Xiaojia
Format:	Preprint
Published:	2026
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2601.22694

_version_	1866910005971648512
author	Zhao, Zhen; Zhang, Tong; Xu, Jie; Cai, Qingliang; Zhang, Qile; Yang, Leyuan; Xiao, Daorui; Chang, Xiaojia
author_facet	Zhao, Zhen; Zhang, Tong; Xu, Jie; Cai, Qingliang; Zhang, Qile; Yang, Leyuan; Xiao, Daorui; Chang, Xiaojia
contents	Recent studies on scaling up ranking models have achieved substantial improvements for recommendation systems and search engines. However, most large-scale ranking systems rely on item IDs, where each item is treated as an independent categorical symbol and mapped to a learned embedding. As items rapidly appear and disappear, these embeddings become difficult to train and maintain. This instability impedes effective learning of neural network parameters and limits the scalability of ranking models. In this paper, we show that semantic tokens possess greater scaling potential than item IDs. Our proposed framework, TRM, improves the token generation and application pipeline, leading to a 33% reduction in sparse storage while achieving a 0.85% AUC increase. Extensive experiments further show that TRM consistently outperforms state-of-the-art models as model capacity scales. Finally, TRM has been successfully deployed on large-scale personalized search engines, yielding 0.26% and 0.75% improvements in user active days and change query ratio, respectively, in A/B tests.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_22694
institution	arXiv
publishDate	2026
record_format	arxiv
title	Farewell to Item IDs: Unlocking the Scaling Potential of Large Ranking Models via Semantic Tokens
topic	Information Retrieval
url	https://arxiv.org/abs/2601.22694

Similar Items