Optimal Client Sampling for Federated Learning
by
Wenlin Chen, Samuel Horváth, Peter Richtárik
2021
Abstract
It is well understood that client-master communication can be a primary
bottleneck in Federated Learning. In this work, we address this issue with a
novel client subsampling scheme, where we restrict the number of clients
allowed to communicate their updates back to the master node. In each
communication round, all participating clients compute their updates, but only
the ones with "important" updates communicate back to the master. We show that
importance can be measured using only the norm of the update and give a formula
for optimal client participation. This formula minimizes the distance between
the full update, where all clients participate, and our limited update, where
the number of participating clients is restricted. In addition, we provide a
simple algorithm that approximates the optimal formula for client
participation, which only requires secure aggregation and thus does not
compromise client privacy. We show both theoretically and empirically that for
Distributed SGD (DSGD) and Federated Averaging (FedAvg), the performance of our
approach can be close to full participation and superior to the baseline where
participating clients are sampled uniformly. Moreover, our approach is
orthogonal to and compatible with existing methods for reducing communication
overhead, such as local methods and communication compression methods.
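
The abstract leaves the sampling formula itself to the paper. As a rough sketch only, assuming the optimal participation probabilities take a capped-proportional ("water-filling") form p_i = min(1, c * ||U_i||), with the constant c chosen so that the probabilities sum to the communication budget m (the expected number of clients allowed to send their updates), a minimal NumPy implementation could look as follows. The function name, the example norms, and the Bernoulli participation draw are illustrative assumptions, not taken from the paper.

    import numpy as np

    def optimal_sampling_probs(update_norms, m):
        """Sketch: p_i = min(1, c * ||U_i||) with c chosen so sum(p_i) = m."""
        norms = np.asarray(update_norms, dtype=float)
        n = len(norms)
        assert 0 < m <= n
        p = np.zeros(n)
        order = np.argsort(-norms)      # client indices by descending update norm
        remaining_sum = norms.sum()
        budget = float(m)
        k = 0
        # Saturate the largest norms at p_i = 1 while a proportional
        # allocation of the remaining budget would push them above 1.
        while k < n and remaining_sum > 0 and norms[order[k]] * budget >= remaining_sum:
            i = order[k]
            p[i] = 1.0
            remaining_sum -= norms[i]
            budget -= 1.0
            k += 1
        # Spread the rest of the budget proportionally to the update norms.
        if remaining_sum > 0 and budget > 0:
            rest = order[k:]
            p[rest] = budget * norms[rest] / remaining_sum
        return p

    # Example: 5 clients, budget of 2 expected participants per round.
    norms = [5.0, 1.0, 0.5, 0.3, 0.2]
    probs = optimal_sampling_probs(norms, m=2)          # sums to 2, all entries <= 1
    participate = np.random.rand(len(norms)) < probs    # independent per-client draw

Under a scheme of this shape, each client can decide locally whether to communicate, and the master would rescale each received update by 1/p_i to keep the aggregate unbiased; this rescaling is an assumption consistent with the abstract's "distance to the full update" criterion, not a quoted detail of the paper.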
Archived Files and Locations
application/pdf, 1.1 MB — arxiv.org (repository), web.archive.org (webarchive)
arXiv: 2010.13723v2