Hyperparameter Optimization: A Spectral Approach
by
Elad Hazan, Adam Klivans, Yang Yuan
2017
Abstract
We give a simple, fast algorithm for hyperparameter optimization inspired by
techniques from the analysis of Boolean functions. We focus on the
high-dimensional regime where the canonical example is training a neural
network with a large number of hyperparameters. The algorithm --- an iterative
application of compressed sensing techniques for orthogonal polynomials ---
requires only uniform sampling of the hyperparameters and is thus easily
parallelizable.
Experiments for training deep neural networks on CIFAR-10 show that compared
to state-of-the-art tools (e.g., Hyperband and Spearmint), our algorithm finds
significantly improved solutions, in some cases better than what is attainable
by hand-tuning. In terms of overall running time (i.e., time required to sample
various settings of hyperparameters plus additional computation time), we are
at least an order of magnitude faster than Hyperband and Bayesian Optimization.
We also outperform Random Search by a factor of 8.
Additionally, our method comes with provable guarantees and yields the first
improvement in the sample complexity of learning decision trees in over two
decades. In particular, we obtain the first quasi-polynomial time algorithm for
learning noisy decision trees with polynomial sample complexity.
Archived Files and Locations
application/pdf, 517.5 kB (file_puqrh5irjndyjajlpg66nx6ozm)
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 1706.00764v1