Asynchronous Stochastic Gradient Descent with Variance Reduction for
Non-Convex Optimization
release_3kblr552evgp5i4xu4dn4ndpp4
by
Zhouyuan Huo, Heng Huang
2016
Abstract
We provide the first theoretical analysis of the convergence rate of the
asynchronous stochastic variance reduced gradient (SVRG) descent algorithm on
non-convex optimization. Recent studies have shown that asynchronous
stochastic gradient descent (SGD) based algorithms with variance reduction
converge at a linear rate on convex problems. However, no prior work analyzes
asynchronous SGD with the variance reduction technique on non-convex
problems. In this paper, we study two asynchronous parallel implementations
of SVRG: one on a distributed memory system and the other on a shared memory
system. We prove that both algorithms obtain a convergence rate of O(1/T),
and that linear speedup is achievable if the number of workers is upper
bounded. Versions v1, v2, and v3 have been withdrawn due to a reference
issue; please refer to the newest version, v4.
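For context, the core of SVRG is a variance-reduced gradient: instead of a plain stochastic gradient grad f_i(x), each update uses grad f_i(x) - grad f_i(x_snap) + grad f(x_snap), where x_snap is a snapshot refreshed once per epoch and grad f(x_snap) is the full gradient at that snapshot. Below is a minimal serial sketch of this update in Python; the names (grad_fn, step, inner_steps) are hypothetical, and the asynchronous distributed- and shared-memory scheduling that the paper actually analyzes is not shown.

```python
import numpy as np

def svrg(grad_fn, x0, n, step, epochs, inner_steps, seed=0):
    """Serial SVRG sketch. grad_fn(x, i) returns the gradient of the
    i-th component function f_i at x, where f = (1/n) * sum_i f_i."""
    rng = np.random.default_rng(seed)
    x_snap = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = np.mean([grad_fn(x_snap, i) for i in range(n)], axis=0)
        x = x_snap.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)
            # Variance-reduced gradient: an unbiased estimate of grad f(x)
            # whose variance shrinks as x stays close to the snapshot.
            v = grad_fn(x, i) - grad_fn(x_snap, i) + full_grad
            x = x - step * v
        x_snap = x  # refresh the snapshot for the next epoch
    return x_snap
```

In the non-convex setting, the O(1/T) rate stated in the abstract refers to the standard stationarity criterion, i.e. the minimum over iterations of the expected squared gradient norm decaying as O(1/T) over T total inner iterations.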
Archived Files and Locations
application/pdf 444.7 kB
file_f5mwcv4uajgcvog74ovredifay
arxiv.org (repository)
web.archive.org (webarchive)
1604.03584v3