Neural Architecture Optimization
release_eydrlukqrnflhkam3txfibjhki
by
Renqian Luo and Fei Tian and Tao Qin and Enhong Chen and Tie-Yan Liu
2018
Abstract
Automatic neural architecture design has shown its potential in discovering
powerful neural network architectures. Existing methods, whether based on
reinforcement learning or evolutionary algorithms (EA), conduct architecture
search in a discrete space, which is highly inefficient. In this paper, we
propose a simple and efficient method for automatic neural architecture design
based on continuous optimization. We call this new approach neural architecture
optimization (NAO). There are three key components in our proposed approach:
(1) an encoder embeds/maps neural network architectures into a continuous
space; (2) a predictor takes the continuous representation of a network as
input and predicts its accuracy; (3) a decoder maps a continuous representation
of a network back to its architecture. The performance predictor and the
encoder enable us to perform gradient-based optimization in the continuous
space to find the embedding of a new architecture with potentially better
accuracy. Such an improved embedding is then decoded to a network by the
decoder. Experiments show that the architecture discovered by our method is
very competitive for the image classification task on CIFAR-10 and the
language modeling task on PTB, outperforming or on par with the best results
of previous architecture search methods while using significantly fewer
computational resources. Specifically, we obtain a 2.11% test set error rate
on the CIFAR-10 image classification task and a test set perplexity of 56.0
on the PTB language modeling task. Furthermore, combined with the recently
proposed weight sharing mechanism, we discover powerful architectures on
CIFAR-10 (with a 3.53% error rate) and on PTB (with a test set perplexity of
56.6), using very limited computational resources (less than 10 GPU hours)
for each task.
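The search loop sketched in the abstract (encode an architecture, ascend the
predictor's gradient in the continuous space, decode the improved embedding)
can be illustrated with a short sketch. The following PyTorch code is not the
authors' implementation: the token vocabulary, sequence length, hidden width,
step size, and number of gradient steps are all illustrative assumptions, and
the joint training of the three components on architecture/accuracy pairs is
omitted.

import torch
import torch.nn as nn

VOCAB = 12     # assumed size of the architecture-token vocabulary
SEQ_LEN = 20   # assumed length of an architecture's token sequence
HID = 64       # assumed width of the continuous embedding

class Encoder(nn.Module):
    """Maps a discrete architecture (token sequence) to a continuous vector."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID)
        self.rnn = nn.LSTM(HID, HID, batch_first=True)
    def forward(self, tokens):                # tokens: (B, SEQ_LEN) long
        h, _ = self.rnn(self.emb(tokens))
        return h.mean(dim=1)                  # (B, HID) continuous embedding

class Predictor(nn.Module):
    """Predicts accuracy from the continuous embedding."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(HID, HID), nn.ReLU(),
                                 nn.Linear(HID, 1))
    def forward(self, z):
        return self.mlp(z).squeeze(-1)

class Decoder(nn.Module):
    """Maps a continuous embedding back to architecture tokens."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)
    def forward(self, z):                     # z: (B, HID)
        h, _ = self.rnn(z.unsqueeze(1).expand(-1, SEQ_LEN, -1))
        return self.out(h).argmax(dim=-1)     # greedy decode to tokens

encoder, predictor, decoder = Encoder(), Predictor(), Decoder()
# In the paper these are trained jointly; training is omitted here.

# Search step: start from a seed architecture's embedding and move it
# along the predictor's gradient toward higher predicted accuracy.
arch = torch.randint(VOCAB, (1, SEQ_LEN))     # stand-in for a seed architecture
z = encoder(arch).detach().requires_grad_(True)
for _ in range(10):                           # assumed number of ascent steps
    predictor(z).sum().backward()
    with torch.no_grad():
        z += 0.1 * z.grad                     # 0.1 is an assumed step size
        z.grad.zero_()
new_arch = decoder(z)                         # decode the improved embedding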
Archived Files and Locations
application/pdf 660.4 kB
file_2ibymgrqhjh3pmli5qxcjehhh4
arxiv.org (repository)
web.archive.org (webarchive)
1808.07233v1