Front. Appl. Math. Stat. | doi: 10.3389/fams.2018.00062

Randomized Distributed Mean Estimation: Accuracy vs Communication

  • 1 University of Edinburgh, United Kingdom
  • 2 Google (United States), United States
  • 3 Moscow Institute of Physics and Technology, Russia

We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes, subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications, including reduce-all operations within algorithms for distributed and federated optimization and learning. We propose a flexible family of randomized algorithms that explore the trade-off between expected communication cost and estimation error. Our family contains the full-communication, zero-error method at one extreme and an epsilon-bit communication, O(1/(epsilon n))-error method at the opposite extreme. In the special case where we communicate, in expectation, a single bit per coordinate of each vector, we improve upon existing results by obtaining O(r/n) error, where r is the number of bits used to represent a floating-point value.
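
The accuracy-communication trade-off described above can be illustrated with a minimal sketch. The abstract does not spell out the protocol, so everything below is an assumption made for illustration: each node keeps a coordinate with probability p (rescaled so the encoding stays unbiased) and otherwise replaces it with a per-node center mu, taken here to be the average of that node's own coordinates. The names `encode`, `estimate_mean`, and `true_mean` are hypothetical.

```python
import random


def encode(x, p, mu, rng):
    """Randomized, unbiased encoding of one node's vector (illustrative assumption).

    Each coordinate v is kept with probability p and rescaled to
    v/p - (1-p)/p * mu; otherwise it is replaced by the center mu.
    The expectation of every output coordinate equals v.
    """
    out = []
    for v in x:
        if rng.random() < p:
            out.append(v / p - (1.0 - p) / p * mu)
        else:
            out.append(mu)
    return out


def estimate_mean(vectors, p, seed=0):
    """Average the encoded vectors, as a central server might (assumption)."""
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])
    total = [0.0] * d
    for x in vectors:
        mu = sum(x) / d  # per-node center: mean of the node's own coordinates
        y = encode(x, p, mu, rng)
        for j in range(d):
            total[j] += y[j]
    return [t / n for t in total]


def true_mean(vectors):
    """Exact coordinate-wise arithmetic average, for comparison."""
    n, d = len(vectors), len(vectors[0])
    return [sum(x[j] for x in vectors) / n for j in range(d)]
```

In this sketch, p = 1 corresponds to sending every coordinate and reproduces the exact average, while a smaller p sends roughly p times as many coordinates per node in expectation (plus the center mu) at the price of a larger estimation error.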

Keywords: Communication efficiency, Distributed mean estimation, Accuracy-communication tradeoff, Gradient compression, Quantization

Received: 11 Oct 2018; Accepted: 28 Nov 2018.

Yiming Ying, University at Albany, United States

Shao-Bo Lin, Wenzhou University, China
Shiyin Qin, Beihang University, China  

PhD. Jakub Konečný, University of Edinburgh, Edinburgh, United Kingdom