Staff View: Inability of spatial transformations of CNN feature maps to support invariant recognition :: Library Catalog

Bibliographic Details
Main Authors:	Jansson, Ylva; Maydanskiy, Maksim; Finnveden, Lukas; Lindeberg, Tony
Format:	Preprint
Published:	2020
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2004.14716

_version_	1866912034140979200
author	Jansson, Ylva; Maydanskiy, Maksim; Finnveden, Lukas; Lindeberg, Tony
author_facet	Jansson, Ylva; Maydanskiy, Maksim; Finnveden, Lukas; Lindeberg, Tony
contents	A large number of deep learning architectures use spatial transformations of CNN feature maps or filters to better deal with variability in object appearance caused by natural image transformations. In this paper, we prove that spatial transformations of CNN feature maps cannot align the feature maps of a transformed image to match those of its original, for general affine transformations, unless the extracted features are themselves invariant. Our proof is based on elementary analysis for both the single- and multi-layer network case. The results imply that methods based on spatial transformations of CNN feature maps or filters cannot replace image alignment of the input and cannot enable invariant recognition for general affine transformations, specifically not for scaling transformations or shear transformations. For rotations and reflections, spatially transforming feature maps or filters can enable invariance but only for networks with learnt or hardcoded rotation- or reflection-invariant features.
format	Preprint
id	arxiv_https___arxiv_org_abs_2004_14716
institution	arXiv
publishDate	2020
record_format	arxiv
title	Inability of spatial transformations of CNN feature maps to support invariant recognition
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2004.14716

Similar Items