MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	You, Kaichao; Qin, Guo; Bao, Anchang; Cao, Meng; Huang, Ping; Shan, Jiulong; Long, Mingsheng
Format:	Preprint
Published:	2023
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2305.11624

_version_	1866929258527457280
author	You, Kaichao; Qin, Guo; Bao, Anchang; Cao, Meng; Huang, Ping; Shan, Jiulong; Long, Mingsheng
author_facet	You, Kaichao; Qin, Guo; Bao, Anchang; Cao, Meng; Huang, Ping; Shan, Jiulong; Long, Mingsheng
contents	Convolution-BatchNorm (ConvBN) blocks are integral components in various computer vision tasks and other domains. A ConvBN block can operate in three modes: Train, Eval, and Deploy. While the Train mode is indispensable for training models from scratch, the Eval mode is suitable for transfer learning and beyond, and the Deploy mode is designed for the deployment of models. This paper focuses on the trade-off between stability and efficiency in ConvBN blocks: Deploy mode is efficient but suffers from training instability; Eval mode is widely used in transfer learning but lacks efficiency. To solve the dilemma, we theoretically reveal the reason behind the diminished training stability observed in the Deploy mode. Subsequently, we propose a novel Tune mode to bridge the gap between Eval mode and Deploy mode. The proposed Tune mode is as stable as Eval mode for transfer learning, and its computational efficiency closely matches that of the Deploy mode. Through extensive experiments in object detection, classification, and adversarial example generation across 5 datasets and 12 model architectures, we demonstrate that the proposed Tune mode retains the performance while significantly reducing GPU memory footprint and training time, thereby contributing efficient ConvBN blocks for transfer learning and beyond. Our method has been integrated into both PyTorch (a general machine learning framework) and MMCV/MMEngine (a computer vision framework). Practitioners need just one line of code to enjoy our efficient ConvBN blocks, thanks to PyTorch's built-in machine learning compilers.
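The relationship between the modes named in the abstract can be illustrated with a small sketch. It is not the authors' code and does not use the PyTorch or MMCV integration; it is a pure-Python toy with a 1D convolution, and every function name and parameter here is hypothetical. Eval mode is modeled as a convolution followed by BatchNorm with fixed running statistics, and Deploy mode as a single convolution whose weights were folded with those statistics once. The abstract does not say how Tune mode is implemented; `forward_tune` assumes it refolds the weights from the original parameters on every call.

```python
import math
import random


def conv1d(x, weight, bias):
    """Valid 1D cross-correlation. x: [C_in][L], weight: [C_out][C_in][K], bias: [C_out]."""
    c_in, k = len(weight[0]), len(weight[0][0])
    length = len(x[0]) - k + 1
    return [
        [b + sum(w[i][j] * x[i][t + j] for i in range(c_in) for j in range(k))
         for t in range(length)]
        for w, b in zip(weight, bias)
    ]


def batchnorm_eval(y, p, eps=1e-5):
    """BatchNorm with fixed running statistics (no batch statistics are computed)."""
    return [
        [p["gamma"][o] * (v - p["mean"][o]) / math.sqrt(p["var"][o] + eps) + p["beta"][o]
         for v in row]
        for o, row in enumerate(y)
    ]


def fold_bn(p, eps=1e-5):
    """Fold the BatchNorm affine transform into the convolution weight and bias."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(p["gamma"], p["var"])]
    weight = [[[s * w for w in taps] for taps in w_out] for s, w_out in zip(scale, p["weight"])]
    bias = [s * (b - m) + be
            for s, b, m, be in zip(scale, p["bias"], p["mean"], p["beta"])]
    return weight, bias


def forward_eval(x, p):
    # Eval mode: two operations, so the intermediate conv output is materialized.
    return batchnorm_eval(conv1d(x, p["weight"], p["bias"]), p)


def forward_tune(x, p):
    # Assumed Tune mode: refold from the ORIGINAL parameters on every call, then one conv.
    weight, bias = fold_bn(p)
    return conv1d(x, weight, bias)


random.seed(0)
C_IN, C_OUT, K, L = 2, 3, 3, 8
params = {
    "weight": [[[random.gauss(0, 1) for _ in range(K)] for _ in range(C_IN)] for _ in range(C_OUT)],
    "bias": [random.gauss(0, 1) for _ in range(C_OUT)],
    "mean": [random.gauss(0, 1) for _ in range(C_OUT)],
    "var": [random.uniform(0.5, 2.0) for _ in range(C_OUT)],
    "gamma": [random.gauss(1, 0.1) for _ in range(C_OUT)],
    "beta": [random.gauss(0, 0.1) for _ in range(C_OUT)],
}
x = [[random.gauss(0, 1) for _ in range(L)] for _ in range(C_IN)]

# Deploy mode: fold once, keep only the folded weights.
deploy_weight, deploy_bias = fold_bn(params)

out_eval = forward_eval(x, params)
out_deploy = conv1d(x, deploy_weight, deploy_bias)
out_tune = forward_tune(x, params)

max_diff = max(
    max(abs(a - b), abs(a - c))
    for ra, rb, rc in zip(out_eval, out_deploy, out_tune)
    for a, b, c in zip(ra, rb, rc)
)
print(f"max difference between modes: {max_diff:.2e}")
```

All three forward passes compute the same function up to floating-point rounding. Under the stated assumption, what separates them during fine-tuning is which parameters receive gradient updates (the folded weights in Deploy, the original convolution and BatchNorm parameters in Eval and Tune) and how many intermediate tensors are kept, neither of which this autograd-free toy shows.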
format	Preprint
id	arxiv_https___arxiv_org_abs_2305_11624
institution	arXiv
publishDate	2023
record_format	arxiv
title	Efficient ConvBN Blocks for Transfer Learning and Beyond
topic	Artificial Intelligence
url	https://arxiv.org/abs/2305.11624
