A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization

by Zhize Li, Jian Li

Released as an article.

2018  

Abstract

Anderson mixing (or Anderson acceleration) is an efficient acceleration method for fixed-point iterations x_{t+1} = G(x_t); e.g., gradient descent can be viewed as iteratively applying the operation G(x) = x - α∇f(x). It is known that Anderson mixing is quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. In this paper, we show that Anderson mixing with Chebyshev polynomial parameters achieves the optimal convergence rate O(√κ log(1/ϵ)), which improves the previous result O(κ log(1/ϵ)) provided by [Toth and Kelley, 2015] for quadratic functions. Then, we provide a convergence analysis for minimizing general nonlinear problems. Moreover, if the hyperparameters (e.g., the Lipschitz smoothness parameter L) are not available, we propose an algorithm that guesses them dynamically, and we prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev mixing method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD) and Nesterov's accelerated GD. These algorithms combined with the proposed guessing algorithm (which guesses the hyperparameters dynamically) also achieve much better performance.
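For a concrete picture of the fixed-point acceleration the abstract describes, below is a minimal sketch of classical Anderson mixing applied to the gradient-descent map G(x) = x - α∇f(x). It is an illustration only: it implements the standard sliding-window, least-squares form of Anderson acceleration, not the paper's Anderson-Chebyshev variant with Chebyshev polynomial parameters, and the function names and the quadratic test problem are assumptions made for this example.

import numpy as np

def anderson_mixing(G, x0, m=5, max_iter=200, tol=1e-10):
    # Classical Anderson mixing for the fixed-point iteration x = G(x)
    # (a generic sketch, not the paper's Anderson-Chebyshev method).
    # At each step, combine the last (up to) m+1 iterates with weights
    # that minimize the norm of the combined residual G(x_i) - x_i,
    # solved as an unconstrained least-squares problem on differences.
    x = np.asarray(x0, dtype=float)
    g = G(x)
    f = g - x                              # fixed-point residual
    Gs, Fs = [g], [f]                      # sliding-window histories
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        if len(Fs) > 1:
            dF = np.column_stack([Fs[i+1] - Fs[i] for i in range(len(Fs) - 1)])
            dG = np.column_stack([Gs[i+1] - Gs[i] for i in range(len(Gs) - 1)])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = g - dG @ gamma             # Anderson-mixed iterate
        else:
            x = g                          # plain fixed-point step to start
        g = G(x)
        f = g - x
        Gs.append(g)
        Fs.append(f)
        if len(Fs) > m + 1:                # keep only the last m+1 entries
            Gs.pop(0)
            Fs.pop(0)
    return x

# Illustration: minimize f(x) = 0.5 x^T A x - b^T x, whose gradient map
# G(x) = x - alpha (A x - b) has the solution of A x = b as fixed point.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M.T @ M + np.eye(50)                   # positive definite Hessian
b = rng.standard_normal(50)
alpha = 1.0 / np.linalg.norm(A, 2)         # step size 1/L
x_star = anderson_mixing(lambda x: x - alpha * (A @ x - b), np.zeros(50))
print(np.linalg.norm(A @ x_star - b))      # close to zero at convergence

On a strongly convex quadratic the fixed point of G is the minimizer, i.e., the solution of Ax = b, so the printed residual should be near zero; this is the setting in which the paper compares against the O(κ log(1/ϵ)) rate of Toth and Kelley.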

Archived Files and Locations

application/pdf, 977.6 kB (file_j5elhmvatnfubehd3gfmvxwsom)
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2018-09-07
Version: v1
Language: en
arXiv: 1809.02341v1
Catalog Record
Revision: 82445ee5-5a74-4f93-bbcc-a6d40466fb4d