Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning
release_whine4sxbfaszl6eu5d6jg4nfa
by
Kaiyi Ji, Junjie Yang, Yingbin Liang
2020
Abstract
Bilevel optimization has arisen as a powerful tool for many machine learning
problems such as meta-learning, hyper-parameter optimization, reinforcement
learning, etc. In this paper, we investigate the nonconvex-strongly-convex
bilevel optimization problem, and propose two novel algorithms, deterBiO and
stocBiO, for the deterministic and stochastic settings, respectively. At the
core of deterBiO is a low-cost, easy-to-implement hyper-gradient estimator
constructed via simple back-propagation. stocBiO updates with mini-batch data
sampling rather than the existing single-sample schemes, using a newly
proposed sample-efficient Hessian inverse estimator. We provide finite-time
convergence guarantees for both algorithms, and show that they improve
order-wise upon the best known computational complexities with respect to the
condition number κ and/or the target accuracy ϵ. We further demonstrate the
superior efficiency of the proposed algorithms through experiments on
meta-learning and hyper-parameter optimization.
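The abstract mentions a sample-efficient Hessian inverse estimator for the hyper-gradient. A standard way to approximate a Hessian-inverse-vector product without forming the inverse is a truncated Neumann series; the sketch below (an illustrative assumption, not the paper's exact estimator, and using a dense NumPy matrix in place of stochastic Hessian-vector products) shows the idea.

```python
import numpy as np

def neumann_hinv_vec(H, v, eta=0.05, K=200):
    """Approximate H^{-1} v via the truncated Neumann series
    H^{-1} ≈ eta * sum_{k=0}^{K} (I - eta*H)^k, which converges
    when H is positive definite and eta < 1 / lambda_max(H)
    (the strongly-convex lower-level problem guarantees such an H)."""
    out = np.zeros_like(v)
    p = v.copy()
    for _ in range(K + 1):
        out += p
        p = p - eta * (H @ p)  # in stochastic variants: a mini-batch Hessian-vector product
    return eta * out

# Usage: compare against the exact solve on a small SPD matrix.
H = np.array([[2.0, 0.5], [0.5, 1.5]])
v = np.array([1.0, -1.0])
approx = neumann_hinv_vec(H, v)
exact = np.linalg.solve(H, v)
```

In a stochastic bilevel method, the product `H @ p` would be replaced by Hessian-vector products evaluated on mini-batches, which is what makes the estimator cheap relative to forming or inverting the full Hessian.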
Archived Files and Locations
application/pdf 1.3 MB
file_7hqxu2wvjfcwvi645mrlvm322u
arxiv.org (repository)
web.archive.org (webarchive)
2010.07962v1