Bibliographic Details
Main Authors: Hurtado, Julio; Ni, Haoran; Sap, Duygu; Mattinson, Connor; Lotz, Martin
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2509.24477
_version_ 1866908565355102208
author Hurtado, Julio
Ni, Haoran
Sap, Duygu
Mattinson, Connor
Lotz, Martin
author_facet Hurtado, Julio
Ni, Haoran
Sap, Duygu
Mattinson, Connor
Lotz, Martin
contents The fashion industry has been identified as a major contributor to waste and emissions, leading to increased interest in promoting the second-hand market. Machine learning methods play an important role in facilitating the creation and expansion of second-hand marketplaces by enabling the large-scale valuation of used garments. We contribute to this line of work by addressing the scalability of second-hand image retrieval from databases. By introducing a selective representation framework, we can shrink databases to 10% of their original size without sacrificing retrieval accuracy. We first explore clustering and coreset selection methods to identify representative samples that capture the key features of each garment and its internal variability. Then, we introduce an efficient outlier removal method, based on a neighbour-homogeneity consistency score, that filters out uncharacteristic samples prior to selection. We evaluate our approach on three public datasets: DeepFashion Attribute, DeepFashion Con2Shop, and DeepFashion2. The results show a clear performance-efficiency trade-off: by strategically pruning and selecting representative image vectors, the retrieval system maintains near-optimal accuracy while greatly reducing computational costs, since fewer images are added to the vector database. Furthermore, applying our outlier removal method before clustering-based selection yields even higher retrieval performance, because non-discriminative samples are removed before representatives are chosen.
format Preprint
id arxiv_https___arxiv_org_abs_2509_24477
institution arXiv
publishDate 2025
record_format arxiv
title Performance-Efficiency Trade-off for Fashion Image Retrieval
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2509.24477
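The contents field describes a two-stage pipeline: filter out uncharacteristic samples, then keep a small set of representatives per garment class. The record does not give the paper's actual score or selection algorithm, so the following is only a minimal sketch under stated assumptions. Here the neighbour-homogeneity score is taken to be the fraction of a sample's k nearest neighbours that share its label, and k-center greedy (farthest-point) selection stands in for the clustering and coreset methods. The function names, the `min_score` threshold, and the default `k` are all hypothetical.

```python
import numpy as np


def neighbour_homogeneity(X, y, k=5):
    """Fraction of each sample's k nearest neighbours that share its label.

    Assumed stand-in for the paper's consistency score, not its definition.
    """
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean
    np.fill_diagonal(dist, np.inf)                       # exclude the sample itself
    nn = np.argsort(dist, axis=1)[:, :k]
    return (y[nn] == y[:, None]).mean(axis=1)


def k_center_greedy(X, budget):
    """Farthest-point selection: return indices of `budget` rows of X."""
    budget = min(budget, len(X))
    # Start from the point closest to the mean, then repeatedly add the
    # point farthest from everything chosen so far.
    first = int(np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    dist = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < budget:
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)


def select_representatives(X, y, keep_fraction=0.1, k=5, min_score=0.5):
    """Drop low-homogeneity samples, then keep ~keep_fraction of each class."""
    scores = neighbour_homogeneity(X, y, k)
    kept = []
    for label in np.unique(y):
        in_class = y == label
        idx = np.flatnonzero(in_class & (scores >= min_score))
        if idx.size == 0:
            # Fallback so every class keeps at least one representative.
            idx = np.flatnonzero(in_class)
        budget = max(1, int(round(keep_fraction * int(in_class.sum()))))
        kept.extend(idx[k_center_greedy(X[idx], budget)])
    return np.sort(np.array(kept))
```

Calling `select_representatives(embeddings, labels)` on an array of image embeddings and integer class labels returns the sorted indices of the vectors to store in the database. Filtering before selection matters for farthest-point methods in particular, since an uncharacteristic sample is by construction far from the rest of its class and would otherwise tend to be picked early.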