Bibliographic Details
Main Authors: Ramsey, Spencer; Grant, Amina; Lee, Jeffrey
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2501.15571
_version_ 1866909466588348416
author Ramsey, Spencer
Grant, Amina
Lee, Jeffrey
author_facet Ramsey, Spencer
Grant, Amina
Lee, Jeffrey
contents Fashion content generation is an emerging area at the intersection of artificial intelligence and creative design, with applications ranging from virtual try-on to culturally diverse design prototyping. Existing methods often struggle with cultural bias, limited scalability, and alignment between textual prompts and generated visuals, particularly under weak supervision. In this work, we propose a novel framework that integrates Large Language Models (LLMs) with Latent Diffusion Models (LDMs) to address these challenges. Our method leverages LLMs for semantic refinement of textual prompts and introduces a weak supervision filtering module to effectively utilize noisy or weakly labeled data. By fine-tuning the LDM on an enhanced DeepFashion+ dataset enriched with global fashion styles, the proposed approach achieves state-of-the-art performance. Experimental results demonstrate that our method significantly outperforms baselines, achieving a lower Fréchet Inception Distance (FID) and a higher Inception Score (IS), while human evaluations confirm its ability to generate culturally diverse and semantically relevant fashion content. These results highlight the potential of LLM-guided diffusion models in driving scalable and inclusive AI-driven fashion innovation.
format Preprint
id arxiv_https___arxiv_org_abs_2501_15571
institution arXiv
publishDate 2025
record_format arxiv
title Cross-Cultural Fashion Design via Interactive Large Language Models and Diffusion Models
topic Computation and Language
url https://arxiv.org/abs/2501.15571