Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms

by Janis Keuper, Franz-Josef Pfreundt

Released as an article.

2015  

Abstract

The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of convergence and accuracy. Recently, several parallelization approaches have been proposed in order to scale SGD to solve very large ML problems. At their core, most of these approaches follow a map-reduce scheme. This paper presents a novel parallel updating algorithm for SGD, which utilizes the asynchronous single-sided communication paradigm. Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent (ASGD) provides faster (or at least equal) convergence, close to linear scaling and stable accuracy.
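The abstract gives no algorithmic detail. The sketch below illustrates only the general idea of asynchronous, lock-free parallel SGD updates on a toy least-squares problem; it is a minimal, assumption-laden illustration, not the authors' implementation. Shared-memory threads stand in for the distributed single-sided communication the paper describes, and all names and hyperparameters (worker, lr, batch, thread count) are hypothetical.

# Minimal sketch of asynchronous parallel SGD on a toy least-squares
# problem (hypothetical illustration; not the paper's distributed,
# single-sided-communication implementation). Worker threads read a
# possibly stale copy of the shared parameter vector and write their
# updates back without any synchronization barrier.
import threading
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = X @ w_true + noise.
n_samples, n_features = 10_000, 20
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.01 * rng.normal(size=n_samples)

# Shared parameter vector, updated asynchronously by all workers.
w = np.zeros(n_features)

def worker(seed, n_steps=5_000, lr=1e-3, batch=8):
    local_rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        idx = local_rng.integers(0, n_samples, size=batch)
        w_local = w.copy()                      # possibly stale read
        grad = X[idx].T @ (X[idx] @ w_local - y[idx]) / batch
        w -= lr * grad                          # unsynchronized write

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("relative error:", np.linalg.norm(w - w_true) / np.linalg.norm(w_true))

The unsynchronized writes trade gradient staleness for the absence of a global barrier, which is the mechanism behind the near-linear scaling the abstract claims.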

Archived Files and Locations

application/pdf  506.0 kB
file_i4bex6qbfrhwtlm62vtb6ftxci
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2015-05-19
Version   v1
Language   en
arXiv  1505.04956v1
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 22592191-f230-4288-b110-960d1e834d70