Relaxed Quantization for Discretized Neural Networks
by
Christos Louizos, Matthias Reisser, Tijmen Blankevoort, Efstratios Gavves, Max Welling
2018
Abstract
Neural network quantization has become an important research area due to its
great impact on the deployment of large models on resource-constrained devices. In
order to train networks that can be effectively discretized without loss of
performance, we introduce a differentiable quantization procedure.
Differentiability can be achieved by transforming continuous distributions over
the weights and activations of the network to categorical distributions over
the quantization grid. These are subsequently relaxed to continuous surrogates
that allow for efficient gradient-based optimization. We further show that
stochastic rounding can be seen as a special case of the proposed approach and
that under this formulation the quantization grid itself can also be optimized
with gradient descent. We experimentally validate the performance of our method
on MNIST, CIFAR-10, and ImageNet classification.
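As a concrete illustration of the procedure the abstract describes, the sketch below quantizes a tensor onto a grid by assigning each value a categorical distribution over grid cells and relaxing the sampled assignment with a Gumbel-softmax. This is a minimal PyTorch sketch under assumptions consistent with the abstract (a logistic noise model around each value, a learnable uniform grid); the function name relaxed_quantize and the hyperparameters sigma, temperature, and grid size are illustrative choices, not the authors' reference implementation.

import torch
import torch.nn.functional as F

def relaxed_quantize(x, grid, sigma=0.3, temperature=0.5):
    # Cell edges sit halfway between neighbouring grid points;
    # the outermost cells extend to +/- infinity.
    edges = (grid[1:] + grid[:-1]) / 2
    lo = torch.cat([grid.new_tensor([float("-inf")]), edges])
    hi = torch.cat([edges, grid.new_tensor([float("inf")])])

    # Probability of each grid cell under a Logistic(x, sigma) noise
    # model: P(cell k) = CDF(hi_k) - CDF(lo_k), where the logistic CDF
    # is a sigmoid. This turns the continuous value into a categorical
    # distribution over the quantization grid.
    x = x.unsqueeze(-1)                           # [..., 1]
    cdf_hi = torch.sigmoid((hi - x) / sigma)      # [..., K]
    cdf_lo = torch.sigmoid((lo - x) / sigma)
    log_probs = torch.log(cdf_hi - cdf_lo + 1e-10)

    # Relax the categorical sample to a continuous surrogate so that
    # gradients can flow through the (soft) grid assignment.
    soft_onehot = F.gumbel_softmax(log_probs, tau=temperature, hard=False)
    return (soft_onehot * grid).sum(-1)           # soft grid value

# Example: quantize weights onto a learnable 8-point grid over [-1, 1].
grid = torch.linspace(-1.0, 1.0, 8, requires_grad=True)
w = torch.randn(4, 4, requires_grad=True)
w_q = relaxed_quantize(w, grid)
w_q.sum().backward()   # gradients reach both w and the grid points

Because the soft one-hot assignment is differentiable, gradients reach both the weights and the grid points themselves, which is what allows the quantization grid to be optimized with gradient descent, as the abstract notes.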
Archived Files and Locations
application/pdf 1.1 MB
arxiv.org (repository), web.archive.org (webarchive)
arXiv: 1810.01875v1