Table 1.1

Typical characteristics of federated learning settings vs. distributed learning in the datacenter (e.g., [33]). Cross-device and cross-silo federated learning are two examples of FL domains, but are not intended to be exhaustive. The primary defining characteristics of FL are highlighted in bold, but the other characteristics are also critical in determining which techniques are applicable.

| | Datacenter Distributed Learning | Cross-Silo Federated Learning | Cross-Device Federated Learning |
| --- | --- | --- | --- |
| Setting | Training a model on a large but "flat" dataset. Clients are compute nodes in a single cluster or datacenter. | Training a model on siloed data. Clients are different organizations (e.g., medical or financial) or geo-distributed datacenters. | The clients are a very large number of mobile or IoT devices. |
| Data distribution | Data is centrally stored and can be shuffled and balanced across clients. Any client can read any part of the dataset. | Data is generated locally and remains decentralized. Each client stores its own data and cannot read the data of other clients. Data is not independently or identically distributed. | Data is generated locally and remains decentralized. Each client stores its own data and cannot read the data of other clients. Data is not independently or identically distributed. |
| Orchestration | Centrally orchestrated. | A central orchestration server/service organizes the training, but never sees raw data. | A central orchestration server/service organizes the training, but never sees raw data. |
| Wide-area communication | None (fully connected clients in one datacenter/cluster). | Typically a hub-and-spoke topology, with the hub representing a coordinating service provider (typically without data) and the spokes connecting to clients. | Typically a hub-and-spoke topology, with the hub representing a coordinating service provider (typically without data) and the spokes connecting to clients. |
| Data availability | All clients are almost always available. | All clients are almost always available. | Only a fraction of clients are available at any one time, often with diurnal or other variations. |
| Distribution scale | Typically 1–1000 clients. | Typically 2–100 clients. | Massively parallel, up to 10^10 clients. |
| Primary bottleneck | Computation is more often the bottleneck in the datacenter, where very fast networks can be assumed. | Might be computation or communication. | Communication is often the primary bottleneck, though it depends on the task. Generally, cross-device federated computations use wi-fi or slower connections. |
| Addressability | Each client has an identity or name that allows the system to access it specifically. | Each client has an identity or name that allows the system to access it specifically. | Clients cannot be indexed directly (i.e., no use of client identifiers). |
| Client statefulness | Stateful: each client may participate in each round of the computation, carrying state from round to round. | Stateful: each client may participate in each round of the computation, carrying state from round to round. | Stateless: each client will likely participate only once in a task, so generally a fresh sample of never-before-seen clients in each round of computation is assumed. |
| Client reliability | Relatively few failures. | Relatively few failures. | Highly unreliable: 5% or more of the clients participating in a round of computation are expected to fail or drop out (e.g., because the device becomes ineligible when battery, network, or idleness requirements are violated). |
| Data partition axis | Data can be partitioned/re-partitioned arbitrarily across clients. | Partition is fixed. Could be example-partitioned (horizontal) or feature-partitioned (vertical). | Fixed partitioning by example (horizontal). |
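
As a rough illustration of the cross-device column (partial availability, a fresh sample of clients per round, and dropout), the following Python sketch simulates client selection for a single round. The function name `select_cohort`, the integer client ids, and the parameter values (10% availability, a target cohort of 1,000, a 5% dropout rate) are assumptions made for illustration only; they are not taken from the table or from any FL framework.

```python
import random


def select_cohort(population, available_fraction, cohort_size, dropout_rate, rng):
    """Simulate client selection for one cross-device round (illustrative only)."""
    # Only a fraction of clients are available at any one time.
    available = [c for c in population if rng.random() < available_fraction]

    # Draw a fresh random sample each round. The integer ids exist only as
    # simulation bookkeeping; a real cross-device system would not address
    # clients by identifier.
    cohort = rng.sample(available, min(cohort_size, len(available)))

    # Some sampled clients fail or drop out before reporting back
    # (hypothetical rate; the table says 5% or more is expected).
    survivors = [c for c in cohort if rng.random() >= dropout_rate]
    return cohort, survivors


rng = random.Random(0)  # fixed seed so the simulation is repeatable
population = range(100_000)  # hypothetical population size
cohort, survivors = select_cohort(
    population,
    available_fraction=0.1,
    cohort_size=1000,
    dropout_rate=0.05,
    rng=rng,
)
print(f"sampled {len(cohort)} clients, {len(survivors)} completed the round")
```

Because clients are treated as stateless, nothing is carried over between calls: each round would simply call `select_cohort` again and obtain a new sample.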
