Staff View: A Fusion Model for Artwork Identification Based on Convolutional Neural Networks and Transformers :: Library Catalog

Bibliographic Details
Main Authors:	Wang, Zhenyu; Song, Heng
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.18083

_version_	1866913709761232896
author	Wang, Zhenyu; Song, Heng
author_facet	Wang, Zhenyu; Song, Heng
contents	The identification of artwork is crucial in areas like cultural heritage protection, art market analysis, and historical research. With the advancement of deep learning, Convolutional Neural Networks (CNNs) and Transformer models have become key tools for image classification. While CNNs excel in local feature extraction, they struggle with global context, and Transformers are strong in capturing global dependencies but weak in fine-grained local details. To address these challenges, this paper proposes a fusion model combining CNNs and Transformers for artwork identification. The model first extracts local features using CNNs, then captures global context with a Transformer, followed by a feature fusion mechanism to enhance classification accuracy. Experiments on Chinese and oil painting datasets show the fusion model outperforms individual CNN and Transformer models, improving classification accuracy by 9.7% and 7.1%, respectively, and increasing F1 scores by 0.06 and 0.05. The results demonstrate the model's effectiveness and potential for future improvements, such as multimodal integration and architecture optimization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_18083
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Fusion Model for Artwork Identification Based on Convolutional Neural Networks and Transformers
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2502.18083
