TY - GEN
T1 - Optimizing Federated Averaging over Fading Channels
AU - Mu, Yujia
AU - Shen, Cong
AU - Eldar, Yonina C.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Deep fading represents the typical error event when communicating over wireless channels. We show that deep fading is particularly detrimental for federated learning (FL) over wireless communications. In particular, the celebrated FedAvg and several of its variants break down for FL tasks when deep fading exists in the communication phase. The main contribution of this paper is an optimal global model aggregation method at the parameter server, which allocates different weights to different clients based on not only their learning characteristics but also the instantaneous channel state information at the receiver (CSIR). This is accomplished by first deriving an upper bound on the convergence of parallel stochastic gradient descent (SGD) over fading channels, and then solving an optimization problem for the server aggregation weights that minimizes this upper bound. The derived optimal aggregation solution is closed-form, and achieves the well-known O(1/t) convergence rate for strongly convex loss functions under arbitrary fading and decaying learning rates. We validate our approach using several real-world FL tasks.
UR - http://www.scopus.com/inward/record.url?scp=85136267220&partnerID=8YFLogxK
U2 - 10.1109/ISIT50566.2022.9834609
DO - 10.1109/ISIT50566.2022.9834609
M3 - Conference contribution
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 1277
EP - 1281
BT - 2022 IEEE International Symposium on Information Theory, ISIT 2022
T2 - 2022 IEEE International Symposium on Information Theory, ISIT 2022
Y2 - 26 June 2022 through 1 July 2022
ER -