Staff View: Virtual Fashion Photo-Shoots: Building a Large-Scale Garment-Lookbook Dataset :: Library Catalog

Bibliographic Details
Main Authors:	Hauri, Yannick; Lanzendörfer, Luca A.; Aczel, Till
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2510.00633

_version_	1866918152362786816
author	Hauri, Yannick; Lanzendörfer, Luca A.; Aczel, Till
author_facet	Hauri, Yannick; Lanzendörfer, Luca A.; Aczel, Till
contents	Fashion image generation has so far focused on narrow tasks such as virtual try-on, where garments appear in clean studio environments. In contrast, editorial fashion presents garments through dynamic poses, diverse locations, and carefully crafted visual narratives. We introduce the task of virtual fashion photo-shoot, which seeks to capture this richness by transforming standardized garment images into contextually grounded editorial imagery. To enable this new direction, we construct the first large-scale dataset of garment-lookbook pairs, bridging the gap between e-commerce and fashion media. Because such pairs are not readily available, we design an automated retrieval pipeline that aligns garments across domains, combining visual-language reasoning with object-level localization. We construct a dataset with three garment-lookbook pair accuracy levels: high quality (10,000 pairs), medium quality (50,000 pairs), and low quality (300,000 pairs). This dataset offers a foundation for models that move beyond catalog-style generation and toward fashion imagery that reflects creativity, atmosphere, and storytelling.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_00633
institution	arXiv
publishDate	2025
record_format	arxiv
title	Virtual Fashion Photo-Shoots: Building a Large-Scale Garment-Lookbook Dataset
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2510.00633

Similar Items