Regression is one of the first families of algorithms that we learn in Machine Learning and Data Science. Whether it is Linear regression or Logistic regression, you have probably used at least one of them before coming to this article. But did you know there are many more regression techniques?
For complex use cases we cannot simply use the basic regression techniques; we need more sophisticated algorithms. Today we will look at different types of regression that you should know and find out their strengths and weaknesses.
By the end of this article, you will know about new regression techniques that you can use in your next project.
Linear Regression is the most basic and fundamental model among regression techniques. Simple Linear Regression models the relationship between one input (independent) variable and an output (dependent) variable using a linear model. We try to find the line that best fits the data.
Y = aX + b
When a linear model is created with multiple independent variables and one output variable, we call it Multiple Linear Regression (often loosely called Multivariate Linear Regression).
Y = a1*X1 + a2*X2 + a3*X3 + ... + an*Xn + b
As the name suggests, it is a linear mapping: the formulas above contain no non-linearity. Hence we can only use them in scenarios where the relationship between the independent and dependent variables is approximately linear.
For more complex relationships we may have to look for other algorithms.
- Linear Regression is simple and easy to understand.
- Sensitive to outliers if they are not handled before training.
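As a minimal sketch of how the line Y = aX + b can be fitted, the snippet below uses the closed-form least-squares solution for a single input variable. The function name `fit_line` and the data are made up for illustration. The second call shows how one outlier can change the fitted slope.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the sum of squared errors for Y = a*X + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(X, Y) / variance(X)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # The least-squares line always passes through the point of means
    b = mean_y - a * mean_x
    return a, b


# Made-up data lying exactly on Y = 2X + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
print(a, b)  # slope 2, intercept 1

# Same data, but the last point is replaced by an outlier
ys_outlier = [1.0, 3.0, 5.0, 7.0, 50.0]
a_out, b_out = fit_line(xs, ys_outlier)
print(a_out, b_out)  # the slope jumps to about 10.2
```

A single extreme point is enough to pull the line far away from the other four, which is why outliers are usually handled before training.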
Polynomial Regression comes to our rescue when we want to fit a model to non-linear data. In this case the model fits the data points to a curve rather than a straight line, so the formula is that of a curve: some of the independent variables are raised to a power greater than 1.
Y = a1*X1 + a2*X2² + a3*X3⁴ + ... + an*Xn + b
For such a regression, selecting the exponent for each variable requires good knowledge of the dataset.
- Polynomial Regression can model non-linear data and other complex relationships.
- Exponents need to be selected carefully, otherwise the model may overfit.
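One way to see Polynomial Regression is as linear regression on expanded features: each input is turned into its powers, and the usual least-squares fit is applied to those. The sketch below assumes a single input variable X with powers up to `degree`, which is a simplification of the formula above. The helper names `solve` and `fit_polynomial` and the data are made up for illustration.

```python
def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [b, a1, ..., a_degree] for
    Y = b + a1*X + a2*X^2 + ... + a_degree*X^degree."""
    # Expand each x into features [1, x, x^2, ...]; the model is linear in these
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    # Normal equations: (F^T F) w = F^T y
    FtF = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Fty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(FtF, Fty)


# Made-up data generated from Y = 1 + 2X + 3X^2
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])  # close to [1, 2, 3]
```

Raising `degree` well beyond what the data needs lets the curve chase noise, which is the overfitting risk mentioned above.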
Ridge Regression is another variation of linear regression. It is a regularization method that helps mitigate the problem of multicollinearity in regression. Multicollinearity is a common problem in models with a large number of parameters.
Under multicollinearity, least squares estimates remain unbiased, but their variances are large. By adding a degree of bias to the regression estimates, ridge regression reduces their standard errors.
The added bias comes from a penalty on the squared size of the coefficients, which shrinks them toward zero and greatly reduces variance. So in Ridge Regression we accept a small bias, introduced by a squared penalty term, in exchange for lower variance.
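Simple Linear Regression optimization function (X is the feature matrix, w the coefficient vector and Y the target vector):
min || Xw - Y ||²
Ridge Regression, introducing bias through a penalty with strength z ≥ 0:
min || Xw - Y ||² + z|| w ||²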
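To see the effect of the penalty, here is a minimal sketch that solves the ridge objective in closed form, (XᵀX + zI)w = XᵀY, for exactly two features and no intercept. The function name `ridge_two_features` and the data are made up for illustration. The two features are nearly identical, which is the multicollinearity case described above.

```python
def ridge_two_features(X, Y, z):
    """Solve (X^T X + z*I) w = X^T Y for a two-column X (no intercept).

    z = 0 gives the ordinary least-squares solution.
    """
    s11 = sum(r[0] * r[0] for r in X) + z
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X) + z
    t1 = sum(r[0] * y for r, y in zip(X, Y))
    t2 = sum(r[1] * y for r, y in zip(X, Y))
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the 2x2 system
    w1 = (t1 * s22 - s12 * t2) / det
    w2 = (s11 * t2 - s12 * t1) / det
    return w1, w2


# Made-up data: the second feature is almost a copy of the first,
# and Y is roughly 2 * (first feature) plus a little noise
X = [[1.0, 1.01], [2.0, 1.99], [3.0, 3.01], [4.0, 3.99]]
Y = [2.1, 3.9, 6.2, 7.8]

ols = ridge_two_features(X, Y, z=0.0)
ridge = ridge_two_features(X, Y, z=1.0)
print(ols)    # approximately (-13, 15): large coefficients of opposite sign
print(ridge)  # approximately (0.98, 0.98): small, stable coefficients
```

Without the penalty the two coefficients blow up in opposite directions to chase the noise. With z = 1 they settle near 1 each, which still sums to the underlying effect of about 2.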
Lasso Regression is a regularization technique similar to Ridge Regression. It performs both variable selection and regularization, which can improve prediction accuracy.
In Lasso Regression we also add a small bias to reduce variance, but here the penalty uses the absolute values of the coefficients (the L1 norm) rather than their squares.
min || Xw - Y ||² + z|| w ||₁
Penalizing absolute values causes some of the parameter estimates to become exactly zero. The larger the penalty, the more the estimates are shrunk toward zero. This helps select the useful variables out of the n available.
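The variable selection effect can be sketched with coordinate descent, one common way to minimize the Lasso objective above. Each coefficient is updated in turn with a soft-thresholding step, which is what produces exact zeros. The function names and the data are made up for illustration, and the objective is used exactly as written above, without an intercept. Library implementations often scale the loss differently, so z here need not match their penalty parameter.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso(X, Y, z, n_iters=200):
    """Coordinate descent for min ||Xw - Y||^2 + z * ||w||_1 (no intercept)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iters):
        for j in range(n_features):
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(
                row[j] * (y - sum(row[k] * w[k] for k in range(n_features) if k != j))
                for row, y in zip(X, Y)
            )
            col_sq = sum(row[j] ** 2 for row in X)
            w[j] = soft_threshold(rho, z / 2) / col_sq
    return w


# Made-up data with three orthogonal features:
# Y = 3 * (feature 1) + 0.2 * (feature 2) + 0 * (feature 3)
X = [[1.0, 1.0, 1.0], [1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
Y = [3.2, 2.8, -2.8, -3.2]

w_no_penalty = lasso(X, Y, z=0.0)
w_lasso = lasso(X, Y, z=2.0)
print(w_no_penalty)  # close to [3, 0.2, 0]
print(w_lasso)       # first coefficient shrinks to about 2.75, the others are 0.0
```

With z = 2 the weak second feature is dropped entirely while the strong first feature is only shrunk, which is the selection behaviour described above.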
ElasticNet Regression is another regularization technique that combines the L1 and L2 penalties of the Lasso and Ridge Regression methods respectively.
min || Xw - Y ||² + z1|| w ||₁ + z2|| w ||²
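For a single feature and no intercept, the ElasticNet objective above has a closed-form minimizer, which makes it easy to see how the two penalties interact. The sketch below is a simplification for illustration only, and the function name `elastic_net_1d` and the data are made up. The L1 part subtracts z1/2 from the feature-target correlation, and the L2 part adds z2 to the denominator.

```python
def elastic_net_1d(xs, ys, z1, z2):
    """Minimizer of ||x*w - y||^2 + z1*|w| + z2*w^2 for one feature (no intercept)."""
    xy = sum(x * y for x, y in zip(xs, ys))
    xx = sum(x * x for x in xs)
    # L1 part: soft-threshold the correlation by z1/2 (can give exactly zero)
    if xy > z1 / 2:
        shrunk = xy - z1 / 2
    elif xy < -z1 / 2:
        shrunk = xy + z1 / 2
    else:
        shrunk = 0.0
    # L2 part: enlarge the denominator by z2 (shrinks but never zeroes)
    return shrunk / (xx + z2)


# Made-up data lying exactly on Y = 2X
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(elastic_net_1d(xs, ys, z1=0.0, z2=0.0))    # 2.0  (plain least squares)
print(elastic_net_1d(xs, ys, z1=0.0, z2=14.0))   # 1.0  (ridge only)
print(elastic_net_1d(xs, ys, z1=28.0, z2=0.0))   # 1.0  (lasso only)
print(elastic_net_1d(xs, ys, z1=28.0, z2=14.0))  # 0.5  (both penalties)
print(elastic_net_1d(xs, ys, z1=56.0, z2=14.0))  # 0.0  (L1 penalty removes the feature)
```

Setting z1 = 0 recovers Ridge and setting z2 = 0 recovers Lasso, so ElasticNet can be read as a dial between the two.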
Congratulations! If you made it to this point, you now know 5 different types of Regression.
That’s it for this post, thank you for reading.