Staff View: BridgeDiff: Bridging Human Observations and Flat-Garment Synthesis for Virtual Try-Off :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Shuang; Yu, Ao; Cheng, Linkang; Huang, Xiwen; Zhao, Li; Liu, Junhui; Lin, Zhiting; Liu, Yu
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.09236

_version_	1866910047362088960
author	Liu, Shuang; Yu, Ao; Cheng, Linkang; Huang, Xiwen; Zhao, Li; Liu, Junhui; Lin, Zhiting; Liu, Yu
author_facet	Liu, Shuang; Yu, Ao; Cheng, Linkang; Huang, Xiwen; Zhao, Li; Liu, Junhui; Lin, Zhiting; Liu, Yu
contents	Virtual try-off (VTOFF) aims to recover canonical flat-garment representations from images of dressed persons for standardized display and downstream virtual try-on. Prior methods often treat VTOFF as direct image translation driven by local masks or text-only prompts, overlooking the gap between on-body appearances and flat layouts. This gap frequently leads to inconsistent completion in unobserved regions and unstable garment structure. We propose BridgeDiff, a diffusion-based framework that explicitly bridges human-centric observations and flat-garment synthesis through two complementary components. First, the Garment Condition Bridge Module (GCBM) builds a garment-cue representation that captures global appearance and semantic identity, enabling robust inference of continuous details under partial visibility. Second, the Flat Structure Constraint Module (FSCM) injects explicit flat-garment structural priors via Flat-Constraint Attention (FC-Attention) at selected denoising stages, improving structural stability beyond text-only conditioning. Extensive experiments on standard VTOFF benchmarks show that BridgeDiff achieves state-of-the-art performance, producing higher-quality flat-garment reconstructions while preserving fine-grained appearance and structural integrity.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_09236
institution	arXiv
publishDate	2026
record_format	arxiv
title	BridgeDiff: Bridging Human Observations and Flat-Garment Synthesis for Virtual Try-Off
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2603.09236

Similar Items