Bibliographic Details
Main Authors:	Tóth, Sándor, Wilson, Stephen, Tsoukara, Alexia, Moreu, Enric, Masalovich, Anton, Roemheld, Lars
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.11593

Table of Contents:

Product matching, the task of identifying different representations of the same product for better discoverability, curation, and pricing, is a key capability for online marketplace and e-commerce companies. We present a robust multi-modal product matching system in an industry setting, where large datasets, data distribution shifts, and unseen domains pose challenges. We compare different approaches and conclude that a relatively straightforward projection of pretrained image and text encoders, trained through contrastive learning, yields state-of-the-art results while balancing cost and performance. Our solution outperforms single-modality matching systems and large pretrained models, such as CLIP. Furthermore, we show how a human-in-the-loop process can be combined with model-based predictions to achieve near-perfect precision in a production system.
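
The abstract names a projection over pretrained image and text encoders trained with contrastive learning, but it does not give the fusion step or the loss. The following is only a minimal sketch of one common way to set this up, not the authors' implementation. It assumes that image and text features are concatenated, passed through a single linear projection, and trained with a symmetric InfoNCE-style loss over pairs of listings of the same product. The names `embed` and `contrastive_loss`, the temperature value, and all dimensions are hypothetical, and random arrays stand in for the outputs of frozen encoders.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def embed(image_feats, text_feats, w):
    # Assumption: fuse modalities by concatenation, then apply one linear projection.
    fused = np.concatenate([image_feats, text_feats], axis=1)
    return l2_normalize(fused @ w)

def contrastive_loss(emb_a, emb_b, temperature=0.07):
    # Row i of emb_a and row i of emb_b are two listings of the same product;
    # every other row in the batch serves as a negative.
    logits = emb_a @ emb_b.T / temperature
    idx = np.arange(logits.shape[0])
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

# Stand-in features for 4 products, each with two listings (A and B).
rng = np.random.default_rng(0)
image_a = rng.normal(size=(4, 16))
text_a = rng.normal(size=(4, 12))
# Listing B is a lightly perturbed copy of listing A (hypothetical data).
image_b = image_a + 0.1 * rng.normal(size=(4, 16))
text_b = text_a + 0.1 * rng.normal(size=(4, 12))

w = rng.normal(size=(16 + 12, 8))  # projection weights that would be trained
emb_a = embed(image_a, text_a, w)
emb_b = embed(image_b, text_b, w)
loss = contrastive_loss(emb_a, emb_b)
```

In a real system the projection weights would be updated by gradient descent on this loss, and matching would then reduce to a nearest-neighbor search over the projected embeddings.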
