Bibliographic Details
Main Authors: Zhao, Zhen; Zhang, Tong; Xu, Jie; Cai, Qingliang; Zhang, Qile; Yang, Leyuan; Xiao, Daorui; Chang, Xiaojia
Format: Preprint
Published: 2026
Subjects: Information Retrieval
Online Access: https://arxiv.org/abs/2601.22694
author Zhao, Zhen
Zhang, Tong
Xu, Jie
Cai, Qingliang
Zhang, Qile
Yang, Leyuan
Xiao, Daorui
Chang, Xiaojia
contents Recent studies on scaling up ranking models have achieved substantial improvements for recommendation systems and search engines. However, most large-scale ranking systems rely on item IDs, where each item is treated as an independent categorical symbol and mapped to a learned embedding. As items rapidly appear and disappear, these embeddings become difficult to train and maintain. This instability impedes effective learning of neural network parameters and limits the scalability of ranking models. In this paper, we show that semantic tokens have greater scaling potential than item IDs. Our proposed framework, TRM, improves the token generation and application pipeline, leading to a 33% reduction in sparse storage while achieving a 0.85% increase in AUC. Extensive experiments further show that TRM consistently outperforms state-of-the-art models as model capacity scales. Finally, TRM has been successfully deployed on large-scale personalized search engines, yielding improvements of 0.26% in user active days and 0.75% in change query ratio, respectively, in A/B tests.
format Preprint
id arxiv_https___arxiv_org_abs_2601_22694
institution arXiv
publishDate 2026
record_format arxiv
title Farewell to Item IDs: Unlocking the Scaling Potential of Large Ranking Models via Semantic Tokens
topic Information Retrieval
url https://arxiv.org/abs/2601.22694