Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting
Decision Tree based on Parameters Server
by
Cheng Daning, Xia Fen, Li Shigang, Zhang Yunquan
2019
Abstract
In AI research and industry, machine learning is the most widely used tool.
One of the most important machine learning algorithms is the Gradient Boosting
Decision Tree (GBDT), whose training process requires considerable
computational resources and time. To shorten GBDT training time, many works
have tried to implement GBDT on a Parameter Server. However, those GBDT
algorithms are synchronous parallel algorithms, which fail to make full use of
the Parameter Server. In this paper, we examine the possibility of using
asynchronous parallel methods to train the GBDT model and name this algorithm
asynch-SGBDT (asynchronous parallel stochastic gradient boosting decision
tree). Our theoretical and experimental results indicate that the scalability
of asynch-SGBDT is influenced by the sample diversity of the dataset, the
sampling rate, the step length, and the settings of the GBDT trees.
Experimental results also show that the asynch-SGBDT training process reaches
a linear speedup in an asynchronous parallel manner when the datasets and
GBDT trees meet the high-scalability requirements.
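
The abstract's core idea, asynchronous stochastic GBDT updates through a
Parameter Server, can be sketched on a single machine. The sketch below is a
hypothetical illustration, not the paper's implementation: threads stand in
for distributed workers, a shared object stands in for the Parameter Server,
depth-1 regression stumps stand in for the paper's GBDT trees, and all names
(ParameterServer, fit_stump, worker) are invented for illustration. Each
worker subsamples the data (the sampling rate), pulls a possibly stale
snapshot of the ensemble, fits a tree to the resulting gradients, and pushes
it back scaled by a step length, with no synchronization barrier between
workers.

import threading
import numpy as np

class ParameterServer:
    """Holds the global model: a growing list of (tree, step_length)."""
    def __init__(self):
        self.trees = []
        self.lock = threading.Lock()

    def pull(self):
        # Workers read a possibly stale snapshot of the ensemble (no barrier).
        return list(self.trees)

    def push(self, tree, eta):
        with self.lock:
            self.trees.append((tree, eta))

def predict(trees, X):
    pred = np.zeros(len(X))
    for (feat, thr, left, right), eta in trees:
        pred += eta * np.where(X[:, feat] <= thr, left, right)
    return pred

def fit_stump(X, residual):
    # Fit a depth-1 regression tree to the negative gradient; this stands in
    # for the deeper GBDT trees whose settings the abstract says affect
    # scalability.
    best = None
    for feat in range(X.shape[1]):
        for thr in np.percentile(X[:, feat], [25, 50, 75]):
            mask = X[:, feat] <= thr
            if mask.all() or not mask.any():
                continue
            left, right = residual[mask].mean(), residual[~mask].mean()
            err = ((residual - np.where(mask, left, right)) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, (feat, thr, left, right))
    return None if best is None else best[1]

def worker(ps, X, y, rounds, sample_rate=0.5, eta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        idx = rng.random(len(X)) < sample_rate   # stochastic row subsample
        if not idx.any():
            continue
        snapshot = ps.pull()                     # possibly stale model
        # Negative gradient of squared loss is the residual.
        residual = y[idx] - predict(snapshot, X[idx])
        tree = fit_stump(X[idx], residual)
        if tree is not None:
            ps.push(tree, eta)                   # asynchronous update

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=500)
ps = ParameterServer()
threads = [threading.Thread(target=worker, args=(ps, X, y, 50, 0.5, 0.1, s))
           for s in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print("train MSE:", ((y - predict(ps.pull(), X)) ** 2).mean())

Under this reading, staleness enters only through the pull of an outdated
snapshot, which is where the sampling rate, step length, and tree settings
named in the abstract would govern convergence.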
Archived Files and Locations
application/pdf, 779.8 kB
arxiv.org (repository); web.archive.org (webarchive)
arXiv:1804.04659v4