Bibliographic Details
Main Authors: Lee, Hyewon, Oh, Junghyun, Song, Minkyung, Park, Soyoung, Han, Seunghoon
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2510.18499
_version_ 1866917029520343040
author Lee, Hyewon
Oh, Junghyun
Song, Minkyung
Park, Soyoung
Han, Seunghoon
author_facet Lee, Hyewon
Oh, Junghyun
Song, Minkyung
Park, Soyoung
Han, Seunghoon
contents This study presents the multilingual e-commerce search system developed by the DILAB team, which achieved 5th place on the final leaderboard with an overall score of 0.8819, demonstrating stable and high-performing results across evaluation metrics. To address challenges in multilingual query-item understanding, we designed a multi-stage pipeline integrating data refinement, lightweight preprocessing, and adaptive modeling. The data refinement stage enhanced dataset consistency and category coverage, while language tagging and noise filtering improved input quality. In the modeling phase, multiple architectures and fine-tuning strategies were explored, and hyperparameters were optimized using curated validation sets to balance performance across the query-category (QC) and query-item (QI) tasks. The proposed framework exhibited robustness and adaptability across languages and domains, highlighting the effectiveness of systematic data curation and iterative evaluation for multilingual search systems. The source code is available at https://github.com/2noweyh/DILAB-Alibaba-Ecommerce-Search.
format Preprint
id arxiv_https___arxiv_org_abs_2510_18499
institution arXiv
publishDate 2025
record_format arxiv
title Alibaba International E-commerce Product Search Competition DILAB Team Technical Report
topic Machine Learning
url https://arxiv.org/abs/2510.18499