X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight
Software Optimizations
by
Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Muhammad
Abdullah Hanif, Maurizio Martina, Guido Masera, Muhammad Shafique
2019
Abstract
Convolutional Neural Networks (CNNs) are widely used due to their excellent
results in various machine learning (ML) tasks such as image classification and
object detection. Recently, Capsule Networks (CapsNets) have shown improved
performance compared to traditional CNNs by better encoding and preserving the
spatial relationships between detected features. This is achieved through
so-called Capsules (i.e., groups of neurons) that encode both the instantiation
probability and the spatial information. However, one of the major hurdles to
the wide adoption of CapsNets is their very long training time, which is
primarily due to the relatively higher complexity of their constituent
elements. In this paper, we illustrate how new optimizations in the training
process can be devised to achieve fast training of CapsNets, and whether such
optimizations affect the network accuracy. To this end, we propose a novel
framework, "X-TrainCaps", that employs lightweight software-level
optimizations. These include a novel learning rate policy called WarmAdaBatch,
which jointly performs warm restarts and adaptive batch size selection, and
weight sharing for capsule layers, which reduces the hardware requirements of
CapsNets by removing unused/redundant connections and capsules. High accuracy
is maintained by testing different learning rate policies and batch sizes. We
demonstrate that one of the solutions generated by the X-TrainCaps framework
achieves a 58.6% reduction in training time while preserving accuracy (and even
improving it by 0.9%) compared to the CapsNet in the original paper by Sabour
et al. (2017), while other Pareto-optimal solutions can be leveraged to realize
trade-offs between training time and achieved accuracy.
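
The abstract names the two ingredients of WarmAdaBatch (warm restarts and an
adaptive batch size) without giving the schedule itself. The following is a
minimal sketch of how such a joint policy could be laid out, not the authors'
implementation. It assumes a cosine-annealed learning rate that restarts when
the batch size changes, with one short cycle at a small batch size followed by
one long cycle at a larger one. The function name warm_ada_batch_plan and every
numeric default (epoch counts, batch sizes, learning rate bounds) are
hypothetical values chosen for illustration.

```python
import math


def warm_ada_batch_plan(total_epochs=30, switch_epoch=3,
                        small_batch=1, large_batch=16,
                        lr_min=1e-4, lr_max=1e-2):
    """Return a list of (epoch, batch_size, learning_rate) tuples.

    Illustrative assumption: two cosine cycles. The first runs over epochs
    [0, switch_epoch) with small_batch, the second over the remaining
    epochs with large_batch. The learning rate restarts at lr_max at the
    start of each cycle (a "warm restart") and decays towards lr_min.
    """
    if not 0 < switch_epoch < total_epochs:
        raise ValueError("switch_epoch must lie strictly between 0 and total_epochs")

    plan = []
    for epoch in range(total_epochs):
        if epoch < switch_epoch:
            batch, start, length = small_batch, 0, switch_epoch
        else:
            batch, start, length = large_batch, switch_epoch, total_epochs - switch_epoch
        # Fraction of the current cycle already completed, in [0, 1).
        progress = (epoch - start) / length
        lr = lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
        plan.append((epoch, batch, lr))
    return plan


if __name__ == "__main__":
    for epoch, batch, lr in warm_ada_batch_plan(total_epochs=8, switch_epoch=3):
        print(f"epoch {epoch}: batch_size={batch}, lr={lr:.5f}")
```

In a real training loop, the batch size from the plan would be used to rebuild
the data loader at the switch epoch, and the learning rate would be written
into the optimizer at the start of each epoch (or interpolated per step).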
arXiv:1905.10142v1