Introduction to Federated Learning

machine learning Oct 6, 2021

Today, I'd like to introduce an idea you might not be familiar with: federated learning.

In Machine Learning, we often go back to these 13 types of learning:

  1. Supervised
  2. Unsupervised
  3. Semi-Supervised
  4. Self-Supervised
  5. Weakly-Supervised
  6. Reinforcement
  7. Meta Learning
  8. Multi-Task Learning
  9. Active Learning
  10. Continual
  11. Incremental
  12. Adversarial
  13. Federated

So what is Federated Learning? And where does it fit?

Federated Learning is a type of Machine Learning where we don't centralize all the data on a single server.

If we had to fit it somewhere, we would use the term Collaborative Learning.

It's especially great when we have sensitive data...

  • For example, when training models on Health Data from our phones, do we want our health data to sit in a big data center alongside everyone else's data? Absolutely not. The data stays on our phone.
  • When we use "Hey Siri!" or "Alexa", the same idea applies: the model can learn from what we say without that training data leaving our phone. Yet the model keeps getting better every day, whether or not we use Siri ourselves.

Similarly, when we type on our keyboard and predictions pop up, what we write never goes to Apple's or Google's servers. Yet the models keep getting better!


How does Federated Learning work?

I found a picture on Google AI's blog, and I'll describe it to you:

At step 0, we have a basic model downloaded on all phones; this is the blue square you have at the very top.

1 - Training on Edge Devices

Let's start with A (on the left). We have a user on her phone, let's call her Stacy. And Stacy is saying all kinds of stuff to Siri. Everything she says, from "Hey Siri!" to "Can you play this music?", is added to a local dataset, and the model trains on her phone with this dataset.

This is all done locally.
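To make "training on the device" concrete, here is a minimal sketch in plain Python. It is not how Siri or any real product trains: the tiny linear model, the `local_train` and `mean_squared_error` helpers, and the toy dataset are all assumptions made for illustration. The point is only that the starting model comes in, the local data is used, and an updated model comes out, all on the phone.

```python
# Hypothetical sketch: one device fine-tunes a tiny model y = w*x + b
# on data that never leaves the device.

def mean_squared_error(weights, data):
    """Average squared prediction error of the model on a dataset."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

def local_train(weights, local_data, lr=0.1, epochs=20):
    """Run plain stochastic gradient descent on the local dataset."""
    w, b = weights
    for _ in range(epochs):
        for x, y in local_data:
            err = (w * x + b) - y   # prediction error on one example
            w -= lr * err * x       # gradient step for the slope
            b -= lr * err           # gradient step for the intercept
    return (w, b)

# Stacy's private, on-device data (toy values following y = 2x + 1).
stacy_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

base_model = (0.0, 0.0)   # the "step 0" model shipped to every phone
stacy_model = local_train(base_model, stacy_data)

print("loss before:", mean_squared_error(base_model, stacy_data))
print("loss after: ", mean_squared_error(stacy_model, stacy_data))
```

The only thing this step produces is `stacy_model`; `stacy_data` is read locally and goes nowhere.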

2 - Uploading the Models to the Cloud

Then, let's go to B (at the bottom). You can see Stacy's model (in green), along with the models trained by all the other users.

👉 These models are going to be added to the Cloud.

Yes, we upload the models, not the data!
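As a rough illustration of "models, not data", here is what an upload could look like. The `ModelUpdate` class, its field names, and the `build_upload` helper are hypothetical, not taken from any real system. What matters is that the payload contains weights and an example count, and none of the raw phrases.

```python
# Hypothetical sketch of an upload payload: weights go up, raw data does not.
import json
from dataclasses import dataclass, asdict

@dataclass
class ModelUpdate:
    device_id: str      # assumed identifier, for illustration only
    weights: list       # the locally trained model parameters
    num_examples: int   # how many local examples were used

def build_upload(device_id, weights, local_data):
    """Package the trained weights; only the *size* of the data is included."""
    return ModelUpdate(device_id, list(weights), len(local_data))

# Raw data stays on the phone (toy values).
local_data = [("hey siri", 1), ("can you play this music", 0)]
trained_weights = [0.42, -0.17]

update = build_upload("stacy-phone", trained_weights, local_data)
payload = json.dumps(asdict(update))
print(payload)
```

One caveat worth knowing: model weights can still leak some information about the data they were trained on, which is why real deployments often add extra protections on top of this basic scheme.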

3 - Downloading the Models on the Phone

All of these models update a shared "supermodel", which is then downloaded to the phones of Stacy and the other users.

👉 Whether Stacy uses Siri or not doesn't matter, her phone is going to download the updates of everybody and make the experience better.
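How do many phone models become one supermodel? The picture doesn't say which rule is used, so the sketch below assumes a simple weighted average in the style of federated averaging (FedAvg): each device's weights count in proportion to how many examples it trained on. The `federated_average` function and the numbers are made up for illustration.

```python
# Hypothetical sketch of server-side aggregation (weighted averaging).

def federated_average(updates):
    """Combine device models into one global model.

    updates: list of (weights, num_examples) pairs, one per device.
    Each weight vector is weighted by the number of local examples.
    """
    total_examples = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [
        sum(weights[i] * n for weights, n in updates) / total_examples
        for i in range(dim)
    ]

# Two devices upload their models (toy values).
updates = [
    ([1.0, 0.0], 10),   # Stacy's model, trained on 10 examples
    ([3.0, 2.0], 30),   # another user's model, trained on 30 examples
]

global_model = federated_average(updates)
print(global_model)  # → [2.5, 1.5]
```

The resulting `global_model` is what every phone downloads for the next round, including the phones of users who contributed nothing this time.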

Which brings me to Wikipedia's definition:

Federated learning (also known as collaborative learning) is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them.

And here's a research paper on it.

I'll let you imagine what becomes possible in terms of privacy when we train a model collectively.

What if Federated Learning becomes the only accepted and tolerated way to train and use a model?

We could train self-driving car models for one specific neighborhood, using only what is learned from the users of that neighborhood.

That leads to hyper-personalization of a Machine Learning algorithm to a specific user, and could make the experience far better than a single model working for everybody.

I hope you learned something today!

If you'd like to learn more about cutting-edge Machine Learning, here are my daily emails where I teach Deep Learning, Computer Vision, and Self-Driving Cars for aspiring cutting-edge engineers!