Table 3.3

Convergence rates for a (non-comprehensive) set of federated optimization methods in non-IID settings. We summarize the key assumptions for non-IID data, local functions on each client, and other assumptions. We also present the variant of the algorithm comparing to Federated Averaging and the convergence rates that eliminate constant

Non-IID Assumptions
SymbolFull NameExplanation
BCGVBounded inter-client gradient variance𝔼i ||∇fi(x) − ∇F(x)||2η2
BOBDBounded optimal objective differenceF*Ei[fi*]η2
BOGVBounded optimal gradient variance𝔼i ||∇fi(x*)||2η2
BGVBounded gradient dissimilarity𝔼i ||∇fi(x)||2/||∇F(x)||2η2
Other Assumptions and Variants
SymbolExplanation
CVXEach client function fi(x) is convex.
SCVXEach client function fi(x) is μ-strongly convex.
BNCVXEach client function has bounded nonconvexity with ∇2fi(x) ⪰ −μI.
BLGVThe variance of stochastic gradients on local clients is bounded.
BLGNThe norm of any local gradient is bounded.
LBGClients use the full batch of local samples to compute updates.
DecDecentralized setting, assumes the connectivity of network is good.
ACAll clients participate in each round.
1stepOne local update is performed on clients in each round.
ProxUse proximal gradient steps on clients.
VRVariance reduction which needs to track the state.
Convergence Rates
MethodNon-IIDOther AssumptionsVariantRate
Lian et al. [36]BCGVBLGVDec; AC; 1stepO(1/T)+O(1/NT)
PD-SGD [114]BCGVBLGVDec; ACO(N/T)+O(1/NT)
MATCHA [49]BCGVBLGVDecO(1/TKM)+O(M/KT)
Khaled et al. [115]BOGVCVXAC; LBGO(N/T)+O(1/NT)
Li et al. [116]BOBDSCVX; BLGV; BLGN-O(K/T)
FedProx [113]BGVBNCVXProxO(1/T)
SCAFFOLD [101]-SCVX; BLGVVRO(1/TKM) + O(e−T)

or Create an Account

Close Modal
Close Modal