Exploiting Invariance in Training Deep Neural Networks

by Chengxi Ye, Xiong Zhou, Tristan McKinney, Yanfeng Liu, Qinggang Zhou, Fedor Zhdanov

Released as an article.

2021  

Abstract

Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks. The resulting algorithm requires less parameter tuning, trains well with an initial learning rate of 1.0, and easily generalizes to different tasks. We enforce scale invariance with local statistics in the data to align similar samples at diverse scales. To accelerate convergence, we enforce a GL(n)-invariance property with global statistics extracted from a batch such that the gradient descent solution should remain invariant under basis change. Profiling analysis shows our proposed modifications take 5% of the computations of the underlying convolution layer. Tested on convolutional networks and transformer networks, our proposed technique requires fewer iterations to train, surpasses all baselines by a large margin, works seamlessly in both small and large batch size training, and applies to different computer vision and language tasks.
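
The two invariance properties named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, which the abstract does not specify. It assumes the simplest possible stand-ins: per-sample L2 normalization as the "local statistic" and whitening by the batch covariance as the "global statistic". The function names `scale_normalize` and `whiten` are hypothetical. The sketch only checks that rescaling a sample leaves its normalized form unchanged, and that an invertible change of basis leaves the Gram matrix of the whitened batch unchanged.

```python
import numpy as np

def scale_normalize(x, eps=1e-12):
    """Divide each sample (row) by its own L2 norm, a local statistic (assumed stand-in)."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return x / (norm + eps)

def whiten(x, eps=1e-12):
    """Center the batch and whiten it with the batch covariance, a global statistic (assumed stand-in)."""
    xc = x - x.mean(axis=0, keepdims=True)
    cov = xc.T @ xc / xc.shape[0]
    # Inverse square root of the symmetric covariance via eigendecomposition.
    w, v = np.linalg.eigh(cov)
    inv_sqrt = (v / np.sqrt(w + eps)) @ v.T
    return xc @ inv_sqrt

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))  # batch of 256 samples, 8 features

# Scale invariance: rescale every sample by its own positive factor.
scales = rng.uniform(0.1, 10.0, size=(256, 1))
scale_gap = np.abs(scale_normalize(x) - scale_normalize(scales * x)).max()

# GL(n) invariance: apply an invertible basis change A = U diag(s) V^T.
u, _ = np.linalg.qr(rng.normal(size=(8, 8)))
v, _ = np.linalg.qr(rng.normal(size=(8, 8)))
a = (u * rng.uniform(0.5, 2.0, size=8)) @ v.T
z, z_changed = whiten(x), whiten(x @ a)
# The whitened features may differ by a rotation, but their Gram matrix does not.
gram_gap = np.abs(z @ z.T - z_changed @ z_changed.T).max()

print(f"scale gap: {scale_gap:.2e}, gram gap: {gram_gap:.2e}")
```

Both gaps should be at the level of floating-point round-off. The Gram matrix is compared rather than the whitened features themselves because whitening is only defined up to an orthogonal rotation, so the features can differ while all pairwise inner products agree.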


Type  article
Stage   submitted
Date   2021-12-09
Version   v2
Language   en
arXiv  2103.16634v2