Bibliographic Details
Main Authors: Xu, Enqiang, Li, Xinhui, Zhou, Zhigong, Ji, Jiahao, Zhao, Jinyuan, Miao, Dadong, Wang, Songlin, Liu, Lin, Xu, Sulong
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2408.05751
Abstract:
  • In the rapidly evolving field of e-commerce, the effectiveness of search re-ranking models is crucial for enhancing user experience and driving conversion rates. Despite significant advancements in feature representation and model architecture, the integration of multimodal information remains underexplored. This study addresses this gap by investigating the computation and fusion of textual and visual information in the context of re-ranking. We propose Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks (ARMMT), which integrates an attention-based multimodal fusion technique and an auxiliary ranking-aligned task to enhance item representation and improve targeting capabilities. This method not only enriches the understanding of product attributes but also enables more precise and personalized recommendations. Experimental evaluations on JD.com's search platform demonstrate that ARMMT achieves state-of-the-art performance in multimodal information integration, evidenced by a 0.22% increase in the Conversion Rate (CVR), significantly contributing to Gross Merchandise Volume (GMV). This pioneering approach has the potential to revolutionize e-commerce re-ranking, leading to elevated user satisfaction and business growth.