Mean Squared Error (MSE) Loss Function and Mean Absolute Error (MAE) Revision
Welcome back again! Today we are going to revise the loss function concepts for regression algorithms. This blog is part of the revision series, so I assume you have some background in regression algorithms and in optimization algorithms such as Gradient Descent.
Regression is the task of predicting continuous signals or data, whether it’s a speech signal, the weather, etc. There are some popular approaches for measuring the loss between the predicted and actual values:
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- Huber Loss
- Adaptive Loss
MSE and MAE are the most commonly used, and the choice depends on the application. In this discussion, let’s dive into MSE loss and MAE loss and discuss the What? Why? When? of each. (Since it’s a revision, I am skipping the “How?”, but it’s equally important.)
Mean Squared Error
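MSE is the mean of the squared differences between the actual and predicted values. The mathematical formulation of this is
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.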
As you can see, this is a simple equation but a very powerful one. It measures the average squared residual. Optimization is easy with Mean Squared Error because the loss is smooth and differentiable everywhere, and this is one of the reasons MSE is so widely adopted. The sample graph of MSE loss behavior is given below
MSE is preferred when the data has few outliers. Its sensitivity to outliers is one of the drawbacks of MSE. Standardizing the data helps the optimization with this loss.
- A simple equation that can be optimized easily compared to other loss functions.
- It has proven its ability over the years and is widely used in fields like signal processing.
- It is highly affected by outliers, so if the data contains outliers, it is better not to use it.
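To make the outlier sensitivity concrete, here is a minimal NumPy sketch. The function name `mse` and the sample values are made up for illustration; they are not taken from any particular library.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Illustrative data: every prediction is off by exactly 0.5.
y_true = np.array([2.0, 4.0, 6.0, 8.0])
y_pred = np.array([2.5, 3.5, 6.5, 7.5])
clean_loss = mse(y_true, y_pred)

# Same predictions, but the last target is now an outlier (residual 92.5).
y_outlier = np.array([2.0, 4.0, 6.0, 100.0])
outlier_loss = mse(y_outlier, y_pred)

print(clean_loss)    # 0.25
print(outlier_loss)  # 2139.25
```

A single outlying point moves the loss from 0.25 to 2139.25, because its residual is squared before averaging.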
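As a rough illustration of why MSE is easy to optimize, and why standardized inputs help, the sketch below fits a straight line with plain gradient descent. The helper names (`standardize`, `fit_line_mse`), the learning rate, and the toy data are all assumptions chosen for this example.

```python
import numpy as np

def standardize(x):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    return (x - x.mean()) / x.std()

def fit_line_mse(x, y, lr=0.1, steps=200):
    """Fit y ~ w * x + b by gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        residual = (w * x + b) - y
        # Gradients of mean(residual ** 2) with respect to w and b.
        grad_w = (2.0 / n) * np.sum(residual * x)
        grad_b = (2.0 / n) * np.sum(residual)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data on a large input scale: y = 3x + 5 exactly.
x = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
y = 3.0 * x + 5.0

x_std = standardize(x)
w, b = fit_line_mse(x_std, y)

# In standardized coordinates the line is y = (3 * std) * x_std + (3 * mean + 5).
print(w, 3.0 * x.std())
print(b, 3.0 * x.mean() + 5.0)
```

The gradient is just the residual scaled by the input, so each update is cheap and smooth. With the raw `x` values, the same learning rate of 0.1 would be far too large and the updates would diverge, which is one reason standardization is recommended.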
Mean Absolute Error
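MAE is the mean of the absolute differences between the actual and predicted values. The mathematical formulation of this is
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.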
The equation is pretty simple, as you can see, and it also limits the effect of outliers. However, optimizing MAE is not as easy, since the loss has a sharp corner at the bottom (the absolute-value function is not differentiable at zero, so the problem becomes a little tricky. If you want to learn more, you can check it out here). The sample graph of MAE loss behavior is given below
MAE is preferred when there is a chance of outliers in the data. This robustness is one of the advantages of MAE. Standardizing the data helps the optimization with this loss.
- It is not much affected by outliers, so if the data contains outliers, it is better to use MAE as the loss.
- The optimization is a little more complex compared to MSE.
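Both points above can be sketched in a few lines of NumPy. The names `mae` and `mae_subgradient` and the sample values are illustrative assumptions; the subgradient uses `np.sign`, which returns 0 at the corner where the true derivative is undefined.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return np.mean(np.abs(y_true - y_pred))

def mae_subgradient(y_true, y_pred):
    """A subgradient of MAE with respect to the predictions.

    np.sign gives 0 where y_pred == y_true, which is one valid choice
    at the non-differentiable corner.
    """
    return np.sign(y_pred - y_true) / len(y_true)

# Illustrative data: the last target is an outlier.
y_true = np.array([2.0, 4.0, 6.0, 100.0])
y_pred = np.array([2.5, 3.5, 6.5, 7.5])

loss = mae(y_true, y_pred)
grad = mae_subgradient(y_true, y_pred)

print(loss)  # 23.5
print(grad)  # [ 0.25 -0.25  0.25 -0.25]
```

The outlier's residual is 92.5, yet its entry in the subgradient has the same magnitude as the others, so it does not dominate the update. The price is that the gradient never shrinks as the prediction approaches the target, which is part of what makes the optimization less straightforward than with MSE.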
Awesome! This time we have gone through the loss functions, mainly MSE and MAE.
Thank you for your time. Let’s meet next time with another exciting revision material.