Regression Losses-2 (Quick Revision)

Navaneeth Sharma
3 min read · Oct 9, 2021

Huber Loss and Adaptive Loss Revision (Some Explanation)

Photo by Susan Q Yin on Unsplash

Continuing the revision series, let’s move on to the Huber loss and the adaptive loss. (This is Part 2 of the regression losses. If you missed the first part, check it out here. Also, if you want to revise machine learning and deep learning concepts, you can catch up with them here.)

Let’s dive into the Huber loss and the adaptive loss and discuss the What? Why? When? of each, as we did for MAE and MSE. (Since it’s a revision, I am skipping the “How?”, but it’s equally important.)

Huber Loss

What?

Huber loss is a combination of the mean squared error (MSE) and mean absolute error (MAE) loss functions. The general equation can be given by

General Equation of Loss

Here, δ is a hyperparameter of the Huber loss: it sets the error threshold at which the loss switches from quadratic to linear. The loss therefore combines the advantages of both MSE and MAE.
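For reference, here is the standard textbook form of the Huber loss, written with the y and y_pred notation used in this post. The equation image from the original is not reproduced here, so treat this as the commonly used definition rather than an exact copy of it.

```latex
L_\delta(y, y_{pred}) =
\begin{cases}
\frac{1}{2}\,(y - y_{pred})^2 & \text{if } |y - y_{pred}| \le \delta \\[4pt]
\delta \left( |y - y_{pred}| - \frac{1}{2}\,\delta \right) & \text{otherwise}
\end{cases}
```

The two branches meet at |y - y_pred| = δ, where both give δ²/2, so the curve has no jump at the switch point.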

Why?

MAE is not differentiable at y = y_pred, but the Huber loss follows the MSE curve in this region: once the absolute error falls below δ, it behaves like MSE. For larger errors the Huber loss takes the linear shape of MAE, so the model is less prone to being skewed by outliers. A sample graph of this behavior is shown below.
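To make the two regions concrete, here is a minimal sketch in plain Python. The function names `huber_loss` and `huber_grad` and the default `delta=1.0` are my own choices for illustration; the formula is the standard textbook form of the Huber loss.

```python
def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss for a single prediction (standard textbook form)."""
    error = y_true - y_pred
    abs_error = abs(error)
    if abs_error <= delta:
        # Quadratic (MSE-like) region for small errors
        return 0.5 * error ** 2
    # Linear (MAE-like) region for large errors
    return delta * (abs_error - 0.5 * delta)


def huber_grad(y_true, y_pred, delta=1.0):
    """Derivative of the Huber loss with respect to y_pred."""
    error = y_true - y_pred
    if abs(error) <= delta:
        # Same gradient as 0.5 * error**2; it passes smoothly through zero
        return -error
    # Constant-magnitude gradient, like MAE
    return -delta if error > 0 else delta


if __name__ == "__main__":
    for err in (0.0, 0.5, 1.0, 3.0, 10.0):
        print(f"error={err:5.1f}  loss={huber_loss(err, 0.0):7.3f}  "
              f"grad={huber_grad(err, 0.0):6.2f}")
```

Unlike MAE, the gradient goes to zero smoothly as the error goes to zero, and unlike MSE, its magnitude never exceeds δ for large errors.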

When?

Huber loss is useful when we need a balance between mean squared error and mean absolute error. MAE is largely insensitive to outliers (even if they make up 20–30% of the data), whereas Huber loss limits the influence of outliers only to some extent; when the outliers are large, it strikes a balance between the two behaviors.
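A small numeric sketch of this balance, in plain Python. The error values below are made up for illustration, and the helper names (`mse`, `mae`, `huber`) are mine; the idea is simply to watch how each averaged loss reacts when one large outlier is added.

```python
def mse(errors):
    """Mean squared error over a list of residuals."""
    return sum(e ** 2 for e in errors) / len(errors)


def mae(errors):
    """Mean absolute error over a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


def huber(errors, delta=1.0):
    """Mean Huber loss over a list of residuals (standard form)."""
    total = 0.0
    for e in errors:
        a = abs(e)
        total += 0.5 * e ** 2 if a <= delta else delta * (a - 0.5 * delta)
    return total / len(errors)


# Hypothetical residuals: four small errors, then the same with one outlier
clean = [0.5, -0.3, 0.2, -0.4]
with_outlier = clean + [20.0]

if __name__ == "__main__":
    for name, fn in (("MSE", mse), ("MAE", mae), ("Huber", huber)):
        print(f"{name:5s}  clean={fn(clean):8.4f}  "
              f"with outlier={fn(with_outlier):8.4f}")
```

On this toy data the MSE is dominated by the single outlier because the error is squared, while MAE and Huber grow only linearly with it.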

Advantages

  • The Huber loss is effective when there are outliers in the data
  • The optimization is easy, as there are no non-differentiable points

Disadvantages

  • The equation is a bit more complex, and δ must be tuned to the problem at hand
  • The adaptive nature of the Huber loss is good, but the adaptivity can be improved
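To see why δ needs tuning, here is a rough sketch that sweeps δ over a few values on made-up residuals. The function name `mean_huber`, the sample errors, and the δ values are all assumptions for illustration.

```python
def mean_huber(errors, delta):
    """Mean Huber loss over a list of residuals (standard form)."""
    total = 0.0
    for e in errors:
        a = abs(e)
        total += 0.5 * e * e if a <= delta else delta * (a - 0.5 * delta)
    return total / len(errors)


# Hypothetical residuals with one large outlier
errors = [0.5, -0.3, 0.2, -0.4, 20.0]

# Reference value: half the MSE, which Huber matches once delta
# is larger than every absolute error in the data
half_mse = sum(0.5 * e * e for e in errors) / len(errors)

if __name__ == "__main__":
    for delta in (0.1, 1.0, 5.0, 25.0):
        print(f"delta={delta:5.1f}  mean Huber={mean_huber(errors, delta):8.4f}")
    print(f"0.5 * MSE = {half_mse:.4f}")
```

A small δ treats almost every error linearly (MAE-like), while a δ larger than the biggest error makes the loss purely quadratic, so the outlier dominates again.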

Adaptive Loss

A lot of research has gone into making losses more robust and adaptive over the years. Charbonnier loss, pseudo-Huber loss, and generalized Charbonnier loss are some examples. One recent work is “A General and Adaptive Robust Loss Function” by Jonathan T. Barron from Google Research. The paper explains the robust and adaptive loss in detail, and I highly recommend reading it.

The general equation proposed in the paper is

Equation from the paper “A General and Adaptive Robust Loss Function” by Jonathan T. Barron
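For reference, this is the general form of the loss as given in Barron's paper, where x is the residual (y - y_pred in this post's notation), α is the shape parameter, and c > 0 is the scale parameter. The equation image from the original post is not reproduced here, so please check the paper for the authoritative statement.

```latex
f(x, \alpha, c) = \frac{|\alpha - 2|}{\alpha}
\left( \left( \frac{(x/c)^2}{|\alpha - 2|} + 1 \right)^{\alpha/2} - 1 \right)
```

The expression is undefined at α = 2 and α = 0; the paper handles these as limits, which give the L2 loss and the Cauchy (Lorentzian) loss respectively. Setting α = 1 gives the Charbonnier / pseudo-Huber form.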

Here α and c are hyperparameters: α controls the shape of the loss (how robust it is to outliers) and c controls its scale. As you can see, it is a complex equation that handles almost all cases. Both α and c need to be chosen based on your requirement. It is more computationally expensive than the losses covered so far. The paper contains graphs and studies that explain the properties of this loss function. For further reading, you can go through the paper.
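As a rough illustration of how α changes the loss, here is a scalar sketch in plain Python. The function name `general_robust_loss` is my own, and the special cases for α = 2, α = 0, and α = -∞ use the limits described in the paper. This is not the author's reference implementation.

```python
import math


def general_robust_loss(x, alpha, c):
    """Barron's general robust loss for a scalar residual x.

    alpha: shape parameter, c: scale parameter (c > 0).
    """
    z = (x / c) ** 2
    if alpha == 2:
        # Limit as alpha -> 2: L2 loss
        return 0.5 * z
    if alpha == 0:
        # Limit as alpha -> 0: Cauchy / Lorentzian loss
        return math.log(0.5 * z + 1.0)
    if alpha == -math.inf:
        # Limit as alpha -> -infinity: Welsch loss
        return 1.0 - math.exp(-0.5 * z)
    b = abs(alpha - 2)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2) - 1.0)


if __name__ == "__main__":
    # Same residual, different shapes: lower alpha penalizes it less
    for alpha in (2, 1, 0, -2, -math.inf):
        print(f"alpha={alpha!s:>5}  loss={general_robust_loss(3.0, alpha, 1.0):.4f}")
```

For a fixed residual, decreasing α lowers the penalty on large errors, which is what makes the loss progressively more robust to outliers.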

Amazing! We revised the Huber loss and got a basic understanding of the adaptive loss.

Thank you for your precious time. Let’s revise more concepts in the future. See you next time.
