Bibliographic Details
Main Authors: Finnveden, Lukas; Jansson, Ylva; Lindeberg, Tony
Format: Preprint
Published: 2020
Online Access: https://arxiv.org/abs/2001.05858
author Finnveden, Lukas
Jansson, Ylva
Lindeberg, Tony
contents Spatial transformer networks (STNs) were designed to enable CNNs to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image and its original. We present a theoretical argument for this and investigate the practical implications, showing that this inability is coupled with decreased classification accuracy. We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
format Preprint
id arxiv_https___arxiv_org_abs_2001_05858
institution arXiv
publishDate 2020
record_format arxiv
title The problems with using STNs to align CNN feature maps
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2001.05858