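Staff View: Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling :: Library Catalog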

Bibliographic Details
Main Authors:	Saha, Sourajit; Gokhale, Tejas
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2404.07410

_version_	1866929609105211392
author	Saha, Sourajit; Gokhale, Tejas
author_facet	Saha, Sourajit; Gokhale, Tejas
contents	Downsampling operators break the shift invariance of convolutional neural networks (CNNs), and this affects the robustness of features learned by CNNs when dealing with even small pixel-level shifts. Through a large-scale correlation analysis framework, we study the shift invariance of CNNs by inspecting existing downsampling operators in terms of their maximum-sampling bias (MSB), and find that MSB is negatively correlated with shift invariance. Based on this insight, we propose a learnable pooling operator called Translation Invariant Polyphase Sampling (TIPS) and two regularizations on the intermediate feature maps of TIPS to reduce MSB and learn translation-invariant representations. TIPS can be integrated into any CNN and trained end-to-end with marginal computational overhead. Our experiments demonstrate that TIPS yields consistent gains in accuracy, shift consistency, and shift fidelity over previous methods on multiple benchmarks for image classification and semantic segmentation, and also improves adversarial and distributional robustness. TIPS results in the lowest MSB of all the methods compared, which explains our strong empirical results.
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_07410
institution	arXiv
publishDate	2024
record_format	arxiv
title	Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2404.07410

Similar Items