# Regression Losses-1 (Quick Revision)

A revision of the Mean Squared Error (MSE) and Mean Absolute Error (MAE) loss functions

Welcome back again! Today we are going to revise the loss function concepts for regression algorithms. This blog is part of the revision series, so I assume you have some background in regression algorithms and in optimization algorithms like gradient descent.

Regression is the term used for predicting continuous signals or data, whether it's a speech signal, the weather, etc. There are a few popular approaches for measuring the loss between the predicted and actual values:

1. Mean Squared Error (MSE)
2. Mean Absolute Error (MAE)
3. Huber Loss

MSE and MAE are the most commonly used, depending on the application. In this discussion, let's dive into MSE loss and MAE loss and discuss their What? Why? When? (Since it's a revision, I am skipping the "How?", but it's equally important.)

# Mean Squared Error

What ?

MSE is an error measure that takes the mean of the squared differences between the actual and predicted values. The mathematical formulation is:

MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

where yᵢ is the actual value, ŷᵢ is the predicted value, and n is the number of samples.

Why?

As you can see, this is a simple equation, but a very powerful one. It directly measures the residual loss. Optimization becomes very easy with mean squared error, since its gradient is linear in the residual; this is one of the reasons MSE is so widely adopted. A sample graph of the MSE loss behavior is given below.
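As a quick sketch of the point above (NumPy assumed; the array values are purely illustrative), here is MSE together with its gradient with respect to the predictions, which is linear in the residual:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    residuals = y_true - y_pred
    return np.mean(residuals ** 2)

def mse_gradient(y_true, y_pred):
    """Gradient of MSE w.r.t. y_pred -- linear in the residual,
    which is what makes gradient-descent updates so simple."""
    n = len(y_true)
    return -2.0 * (y_true - y_pred) / n

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
print(mse(y_true, y_pred))  # mean of [0.25, 0.0, 1.0] ≈ 0.4167
```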

When?

MSE is preferred when the data has few or no outliers; sensitivity to outliers is one of the drawbacks of MSE. Using standardized data makes optimization with this loss more efficient.

Pros:

• Simple equation; easy to optimize compared to other losses.
• Has proven its ability over the years; it is widely used in fields like signal processing.

Cons:

• It is highly affected by outliers, so if the data contains outliers, it is better not to use it.
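To see the outlier sensitivity concretely (a toy example with made-up numbers, not real data), compare MSE on predictions with uniformly small errors against the same predictions with a single wild value:

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
clean_pred = np.array([1.1, 2.1, 2.9, 4.1])     # small errors everywhere
outlier_pred = np.array([1.1, 2.1, 2.9, 14.0])  # one wild prediction

print(mse(y_true, clean_pred))    # ≈ 0.01
print(mse(y_true, outlier_pred))  # ≈ 25.01, dominated by the single squared outlier
```

A single bad prediction changes the loss by three orders of magnitude here, because squaring amplifies the largest residual.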

# Mean Absolute Error

What?

MAE is an error measure that takes the mean of the absolute differences between the actual and predicted values. The mathematical formulation is:

MAE = (1/n) · Σᵢ₌₁ⁿ |yᵢ − ŷᵢ|

where yᵢ is the actual value, ŷᵢ is the predicted value, and n is the number of samples.

Why?

The equation is pretty simple, as you can see, and this simple equation also reduces the effect of outliers. However, optimizing MAE is not as easy, since the loss has a sharp corner at the bottom (the absolute-value function is not differentiable at zero, so the problem becomes a little bit tricky; if you want to learn more, you can check it out here). A sample graph of the MAE loss behavior is given below.
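Here is a minimal MAE sketch (NumPy assumed; values are illustrative), with the sign-based subgradient that one typically uses because of the non-differentiable point at zero:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return np.mean(np.abs(y_true - y_pred))

def mae_subgradient(y_true, y_pred):
    """Subgradient of MAE w.r.t. y_pred.
    |r| is not differentiable at r = 0, so we use sign(r);
    np.sign returns 0 at exactly 0, a valid subgradient choice."""
    n = len(y_true)
    return -np.sign(y_true - y_pred) / n

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
print(mae(y_true, y_pred))  # mean of [0.5, 0.0, 1.0] = 0.5
```

Note that the subgradient only carries the sign of each residual, not its magnitude, which is exactly why MAE updates are less swayed by large outliers.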

When?

MAE is preferred when there is a chance of having outliers in the data; robustness to outliers is one of the advantages of MAE. Using standardized data makes optimization with this loss more efficient.
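Tying the two losses together (again with made-up residuals, purely for illustration): when one residual is an outlier, MSE blows up while MAE stays moderate, because MSE squares each residual and MAE only takes its absolute value:

```python
import numpy as np

residuals = np.array([0.5, -0.3, 0.2, 10.0])  # last one is an outlier

mse_val = np.mean(residuals ** 2)     # the outlier contributes 100 to the sum
mae_val = np.mean(np.abs(residuals))  # the outlier contributes only 10

print(f"MSE = {mse_val:.3f}")  # 25.095
print(f"MAE = {mae_val:.3f}")  # 2.750
```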