Snapshot Ensembles: Train 1, get M for free
by
Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft,
Kilian Q. Weinberger
2017
Abstract
Ensembles of neural networks are known to be much more robust and accurate
than individual networks. However, training multiple deep networks for model
averaging is computationally expensive. In this paper, we propose a method to
obtain the seemingly contradictory goal of ensembling multiple neural networks
at no additional training cost. We achieve this goal by training a single
neural network, converging to several local minima along its optimization path
and saving the model parameters. To obtain repeated rapid convergence, we
leverage recent work on cyclic learning rate schedules. The resulting
technique, which we refer to as Snapshot Ensembling, is simple, yet
surprisingly effective. We show in a series of experiments that our approach is
compatible with diverse network architectures and learning tasks. It
consistently yields lower error rates than state-of-the-art single models at no
additional training cost, and compares favorably with traditional network
ensembles. On CIFAR-10 and CIFAR-100 our DenseNet Snapshot Ensembles obtain
error rates of 3.4% and 17.4% respectively.
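The abstract summarizes the core recipe: train one network with a cyclic learning rate, save a snapshot of the weights at the end of each cycle (when the network has settled near a local minimum), and average the snapshots' predictions at test time. Below is a minimal, hypothetical PyTorch-style sketch of that idea. The function names, the schedule constants, and the pre-existing `model`, `train_loader`, and `loss_fn` objects are illustrative assumptions, not the authors' released code; the shifted-cosine schedule follows the cyclic annealing described in the paper.

```python
import math
import copy
import torch

def snapshot_ensemble_train(model, train_loader, loss_fn,
                            total_epochs=300, num_cycles=6, alpha0=0.1):
    """Hypothetical helper: train one model with a cyclic cosine-annealed
    learning rate, saving one snapshot at the end of each cycle."""
    optimizer = torch.optim.SGD(model.parameters(), lr=alpha0, momentum=0.9)
    epochs_per_cycle = total_epochs // num_cycles
    snapshots = []

    for epoch in range(total_epochs):
        # Cosine annealing within the current cycle: the learning rate starts
        # at alpha0, decays toward 0, then restarts at the next cycle.
        t = epoch % epochs_per_cycle
        lr = 0.5 * alpha0 * (math.cos(math.pi * t / epochs_per_cycle) + 1)
        for group in optimizer.param_groups:
            group["lr"] = lr

        model.train()
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()

        # End of a cycle: the model has (approximately) converged to a local
        # minimum, so save its parameters as one ensemble member.
        if (epoch + 1) % epochs_per_cycle == 0:
            snapshots.append(copy.deepcopy(model.state_dict()))

    return snapshots


def snapshot_ensemble_predict(model, snapshots, inputs):
    """Average softmax predictions over the saved snapshots."""
    probs = None
    model.eval()
    with torch.no_grad():
        for state in snapshots:
            model.load_state_dict(state)
            p = torch.softmax(model(inputs), dim=1)
            probs = p if probs is None else probs + p
    return probs / len(snapshots)
```

At test time, `snapshot_ensemble_predict` treats each saved checkpoint as one ensemble member; because all members come from a single training run, the total training cost is the same as for one model.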
Archived Files and Locations
application/pdf, 1.2 MB — arXiv:1704.00109v1 (arxiv.org repository; web.archive.org webarchive)