Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server

by Cheng Daning, Xia Fen, Li Shigang, Zhang Yunquan

Released as an article.

2019  

Abstract

In AI research and industry, machine learning is the most widely used tool. One of the most important machine learning algorithms is the Gradient Boosting Decision Tree (GBDT), whose training process needs considerable computational resources and time. To shorten GBDT training time, many works have tried to run GBDT on a Parameter Server. However, those GBDT algorithms are synchronous parallel algorithms, which fail to make full use of the Parameter Server. In this paper, we examine the possibility of using asynchronous parallel methods to train the GBDT model and name this algorithm asynch-SGBDT (asynchronous parallel stochastic gradient boosting decision tree). Our theoretical and experimental results indicate that the scalability of asynch-SGBDT is influenced by the sample diversity of the dataset, the sampling rate, the step length and the setting of the GBDT trees. Experimental results also show that the asynch-SGBDT training process reaches a linear speedup in an asynchronous parallel manner when the datasets and GBDT trees meet the high-scalability requirements.
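To make the idea in the abstract concrete, the sketch below simulates an asynchronous, parameter-server-style stochastic gradient boosting loop with Python threads: each worker pulls a possibly stale snapshot of the ensemble, fits a regression tree to the negative gradient (residuals) of a sampled subset, and pushes the tree back without waiting for other workers. The ParameterServer class, the squared-error loss, and all hyperparameter values are illustrative assumptions, not the authors' implementation.

    # Minimal sketch of asynchronous stochastic GBDT in a parameter-server style.
    # All class and function names here are hypothetical.
    import threading
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.datasets import make_regression

    class ParameterServer:
        """Holds the shared ensemble; workers push trees without a barrier."""
        def __init__(self, step_length):
            self.trees = []
            self.step_length = step_length
            self.lock = threading.Lock()  # protects only the list append

        def pull(self):
            # Workers may read a stale snapshot of the ensemble.
            return list(self.trees)

        def push(self, tree):
            with self.lock:
                self.trees.append(tree)

        def predict(self, X):
            pred = np.zeros(X.shape[0])
            for t in self.pull():
                pred += self.step_length * t.predict(X)
            return pred

    def worker(ps, X, y, n_rounds, sample_rate, max_depth, rng):
        n = X.shape[0]
        for _ in range(n_rounds):
            snapshot = ps.pull()                      # possibly stale ensemble
            idx = rng.choice(n, int(sample_rate * n), replace=False)
            pred = np.zeros(len(idx))
            for t in snapshot:
                pred += ps.step_length * t.predict(X[idx])
            residual = y[idx] - pred                  # negative gradient for squared loss
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X[idx], residual)
            ps.push(tree)                             # asynchronous update, no synchronization barrier

    if __name__ == "__main__":
        X, y = make_regression(n_samples=2000, n_features=20, noise=0.1, random_state=0)
        ps = ParameterServer(step_length=0.1)
        threads = [
            threading.Thread(target=worker,
                             args=(ps, X, y, 25, 0.5, 3, np.random.default_rng(seed)))
            for seed in range(4)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        mse = np.mean((ps.predict(X) - y) ** 2)
        print(f"trees: {len(ps.trees)}, training MSE: {mse:.3f}")

In this toy setup the sampling rate, step length and tree depth correspond to the factors the abstract identifies as governing asynch-SGBDT's scalability; the stale snapshot read in each worker is what distinguishes the asynchronous scheme from synchronous parallel GBDT.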

Archived Files and Locations

application/pdf  779.8 kB
file_i42ane4c7zc2jjhea2hizvw7i4
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2019-07-18
Version   v4
Language   en
arXiv  1804.04659v4
Catalog Record
Revision: 831d6359-a0fa-49a8-8bac-8fc20e64138a