The Field Guide to Machine Learning, Lesson 3: Evaluation
The Facebook Field Guide to Machine Learning is a six-part video series developed by the Facebook ads machine learning team. The series shares real-world best practices and practical tips for applying machine learning to real-world problems.
If you are interested in using machine learning to enhance your product in the real world, it’s important to understand how the entire development process works. It’s not only what happens during the training of your models, but everything that comes before and after, and how each step can either set you up for success or doom you to fail.
The Facebook ads machine learning team has developed a series of videos to help engineers and new researchers learn to apply their machine learning skills to real-world problems. The Facebook Field Guide to Machine Learning series breaks down the machine learning process into six steps:
1. Problem definition
In lessons one and two, you learned to define the problem and prepare training data. Now that you have a clear understanding of the problem and have prepared your training data with a basic set of features, you’re ready to start lesson three, evaluation.
Lesson 3: Evaluation. Before jumping into developing more features and iterating on model architectures, it’s important to have a clear plan for how to evaluate the performance of your model. Lesson three covers how to evaluate your approach.
In this lesson, we offer insights and recommendations for evaluation:
• Evaluation is made up of two things: the data you evaluate on, and the statistics you calculate. Both should aim to reflect the online use case as closely as possible
• Offline evaluation as a first step, followed by online experimentation
• How to build a baseline model
• Approaches: random split and progressive evaluation
• Metrics: what statistics to use to measure performance
• Specificity: understand where the performance comes from
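As a rough illustration of the points above, here is a minimal Python sketch that compares a random split with a progressive split on a constant-rate baseline. It is not taken from the video series. The toy data, the function names, the choice of log loss as the metric, and the reading of "progressive evaluation" as "train on earlier days, evaluate on later days" are all assumptions made for this example.

```python
import math
import random


def make_toy_rows(days=10, per_day=10):
    """Synthetic toy data (hypothetical): the positive rate drifts upward over time."""
    rows = []
    for day in range(days):
        positives = 1 + day // 2
        for i in range(per_day):
            rows.append({"day": day, "label": 1 if i < positives else 0})
    return rows


def log_loss(labels, probs, eps=1e-12):
    """Average negative log-likelihood for binary labels."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def progressive_split(rows, cutoff_day):
    """Train on days before the cutoff, evaluate on the cutoff day onward."""
    train = [r for r in rows if r["day"] < cutoff_day]
    test = [r for r in rows if r["day"] >= cutoff_day]
    return train, test


def baseline_rate(train):
    """Baseline model: predict the training positive rate for every example."""
    return sum(r["label"] for r in train) / len(train)


def evaluate_baseline(train, test):
    """Log loss of the constant-rate baseline on the held-out rows."""
    p = baseline_rate(train)
    return log_loss([r["label"] for r in test], [p] * len(test))


def log_loss_by_day(test, p):
    """Break the metric down by day to see where the performance comes from."""
    by_day = {}
    for r in test:
        by_day.setdefault(r["day"], []).append(r["label"])
    return {d: log_loss(ys, [p] * len(ys)) for d, ys in sorted(by_day.items())}


rows = make_toy_rows()
rand_train, rand_test = random_split(rows)
prog_train, prog_test = progressive_split(rows, cutoff_day=8)

print("random split baseline log loss:     ", evaluate_baseline(rand_train, rand_test))
print("progressive split baseline log loss:", evaluate_baseline(prog_train, prog_test))
print("per-day breakdown (progressive):    ",
      log_loss_by_day(prog_test, baseline_rate(prog_train)))
```

Because the toy positive rate drifts over time, the random split lets the baseline see examples from every day, while the progressive split forces it to predict days it has never seen. Under these assumptions, the second setup is closer to the online use case. Any candidate model would then be compared against the same baseline on the same split.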
Offline evaluation is best used for wide exploration, and the feedback loop from hypothesis to results must be kept as short as possible so that all ideas can be tried offline quickly.
Once promising results are obtained offline, move fast towards testing the candidate model or models on live traffic, which is the ultimate goal for evaluation.