Staff View: A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies

Bibliographic Details
Main Authors:	Zhu, Jinchao; Wang, Yuxuan; Pan, Siyuan; Wan, Pengfei; Zhang, Di; Huang, Gao
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.00210

_version_	1866911919889186816
author	Zhu, Jinchao; Wang, Yuxuan; Pan, Siyuan; Wan, Pengfei; Zhang, Di; Huang, Gao
author_facet	Zhu, Jinchao; Wang, Yuxuan; Pan, Siyuan; Wan, Pengfei; Zhang, Di; Huang, Gao
contents	The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantization, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusting the model architecture. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. 1) For the tuning method, first, we design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation. Second, to mitigate performance loss due to pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into compressed UNets to enhance network performance by increasing capacity without sacrificing speed. Third, we validate the effectiveness of the multi-UNet switching method for improving network speed. 2) For the tuning-free method, we propose a feature inheritance strategy to accelerate inference by skipping local computations at the block, layer, or unit level within the network structure. We also examine multiple sampling modes for feature inheritance at the time-step level. Experiments demonstrate that both the proposed tuning and tuning-free methods can improve the speed and performance of the SDM. The lightweight model reconstructed by the model assembly strategy increases generation speed by 22.4%, while the feature inheritance strategy enhances the SDM generation speed by 40.0%.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_00210
institution	arXiv
publishDate	2024
record_format	arxiv
title	A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.00210

Similar Items