Bibliographic Details
Main Authors: Shi, Chuancheng, Chen, Yixiang, Lei, Burong, Chen, Jichao
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2507.13311
_version_ 1866915396505829376
author Shi, Chuancheng
Chen, Yixiang
Lei, Burong
Chen, Jichao
contents Realistic and controllable garment visualization is critical for fashion e-commerce, where users expect personalized previews under diverse poses and lighting conditions. Existing methods often rely on predefined poses, limiting semantic flexibility and illumination adaptability. To address this, we introduce FashionPose, the first unified text-to-pose-to-relighting generation framework. Given a natural language description, our method first predicts a 2D human pose, then employs a diffusion model to generate high-fidelity person images, and finally applies a lightweight relighting module, all guided by the same textual input. By replacing explicit pose annotations with text-driven conditioning, FashionPose enables accurate pose alignment, faithful garment rendering, and flexible lighting control. Experiments demonstrate fine-grained pose synthesis and efficient, consistent relighting, providing a practical solution for personalized virtual fashion display.
format Preprint
id arxiv_https___arxiv_org_abs_2507_13311
institution arXiv
publishDate 2025
record_format arxiv
title FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2507.13311