An introductory Colab notebook showing how to deal with outliers in simple Machine Learning tasks.

Your company collected a dataset on its advertising spend across TV, Radio and Newspapers. The task: given the amount of money invested in each medium, predict sales.

In this notebook we are going to meet outliers 😦 for the first time!

What are we going to do? 🤔

  1. Data import, analysis & preprocessing ⚙️
  2. Train an ML model
  3. Treat outliers
  4. Check performance degradation due to outliers
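
To give a feel for steps 2 to 4, here is a minimal sketch on made-up numbers. It is not the notebook's actual code: the data is synthetic, the model is a hand-written one-feature least-squares fit, and the helper names `fit_line` and `mse` are hypothetical. "Treating" the outlier here simply means dropping it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): sales = 2 * tv + 1.
tv = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [2 * x + 1 for x in tv]

# Same data plus one artificial outlier.
tv_dirty = tv + [9]
sales_dirty = sales + [100]

model_dirty = fit_line(tv_dirty, sales_dirty)  # trained with the outlier
model_clean = fit_line(tv, sales)              # trained after dropping it

# Evaluate both models on the clean points only.
mse_dirty = mse(model_dirty, tv, sales)
mse_clean = mse(model_clean, tv, sales)
print(f"MSE with outlier: {mse_dirty:.3f}, without: {mse_clean:.3f}")
```

A single extreme point is enough to pull the fitted line away from the rest of the data, which shows up as a higher error on the clean points.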

But wait… what actually is an outlier?

From Wikipedia: in statistics, an outlier is a data point that differs significantly from other observations.
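
One common way to turn "differs significantly" into a concrete rule is the 1.5 × IQR rule of thumb: flag anything below Q1 − 1.5·IQR or above Q3 + 1.5·IQR. The sketch below is only an illustration of that rule, not necessarily the method used in the notebook; the function name `iqr_outliers` and the sample values are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sales figures with one suspicious value.
sales = [10.4, 11.2, 9.8, 12.1, 10.9, 11.5, 48.0]
outliers = iqr_outliers(sales)
print(outliers)  # [48.0]
```

The multiplier `k` is a convention rather than a law: a larger value flags fewer points, a smaller one flags more.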

For a better experience, open this notebook in Colab.