HADFL: Heterogeneity-aware Decentralized Federated Learning Framework

by Jing Cao, Zirui Lian, Weihong Liu, Zongwei Zhu, Cheng Ji

Released as an article.

2021  

Abstract

Federated learning (FL) supports training models on geographically distributed devices. However, traditional FL systems adopt a centralized synchronous strategy, which puts high communication pressure on the central server and poses challenges for model generalization. Existing FL optimizations either fail to speed up training on heterogeneous devices or suffer from poor communication efficiency. In this paper, we propose HADFL, a framework that supports decentralized asynchronous training on heterogeneous devices. Devices train the model locally on their own data, using a heterogeneity-aware number of local steps. In each aggregation cycle, devices are selected probabilistically to perform model synchronization and aggregation. Compared with a traditional FL system, HADFL relieves the central server's communication pressure, efficiently utilizes heterogeneous computing power, and achieves maximum speedups of 3.15x over decentralized FedAvg and 4.68x over the PyTorch distributed training scheme, with almost no loss of convergence accuracy.
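The abstract outlines three mechanisms: heterogeneity-aware local steps, probabilistic device selection, and decentralized aggregation. The following is a minimal Python sketch of how such a training loop could look; the device count, speed values, selection probabilities, and simple-averaging rule are all illustrative assumptions, not HADFL's actual algorithm.

```python
# Minimal sketch of a decentralized, heterogeneity-aware FL loop as described
# in the abstract. All names, constants, and update rules are assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_DEVICES = 4
DIM = 10
TIME_BUDGET = 1.0                        # wall-clock budget per aggregation cycle
speeds = np.array([1.0, 0.5, 2.0, 0.8])  # assumed relative compute power per device

# Each device holds its own model replica and synthetic local data.
models = [rng.normal(size=DIM) for _ in range(N_DEVICES)]
data = [rng.normal(size=(100, DIM)) for _ in range(N_DEVICES)]
targets = [x @ np.ones(DIM) + 0.1 * rng.normal(size=100) for x in data]

def local_steps(device):
    """Heterogeneity-aware: faster devices fit more steps into the same budget."""
    return max(1, int(TIME_BUDGET * speeds[device] * 10))

def sgd_step(w, x, y, lr=0.01):
    grad = 2 * x.T @ (x @ w - y) / len(y)  # least-squares gradient
    return w - lr * grad

for cycle in range(20):
    # 1) Local training: each device runs a device-specific number of steps.
    for d in range(N_DEVICES):
        for _ in range(local_steps(d)):
            models[d] = sgd_step(models[d], data[d], targets[d])

    # 2) Probabilistic selection: a subset of devices is picked for aggregation.
    #    Here the probability is proportional to compute power (an assumption).
    probs = speeds / speeds.sum()
    selected = rng.choice(N_DEVICES, size=2, replace=False, p=probs)

    # 3) Decentralized aggregation among the selected devices (simple average,
    #    no central server).
    avg = np.mean([models[d] for d in selected], axis=0)
    for d in selected:
        models[d] = avg

print("per-device loss:",
      [round(float(np.mean((data[d] @ models[d] - targets[d]) ** 2)), 3)
       for d in range(N_DEVICES)])
```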

Archived Files and Locations

application/pdf  1.5 MB
file_pxaq6ivogffqdmyv2guw3elfzm
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2021-11-16
Version   v1
Language   en
arXiv  2111.08274v1
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 4c5448f6-02d3-4e28-bf64-64e9ca60449f