Regularization Algorithms
Understanding the use of Regularization algorithms like LASSO, Ridge, and Elastic-Net regression.
Before jumping directly into this article, make sure you know the maths behind the Linear Regression algorithm. If you don’t, follow that article first!
Regularization is a technique used in regression to reduce the complexity of the model and to shrink the coefficients of the independent features.
“Everything should be made as simple as possible, but no simpler.” -Albert Einstein
In simple words, this technique converts a complex model into a simpler one to avoid the risk of overfitting, and it shrinks the coefficients to lower the computational cost.
The working of all these algorithms is quite similar to that of Linear Regression, it’s just the loss function that keeps on changing!
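For reference, the least-squares loss that each of the following methods modifies can be written as below (a standard form, with β for the coefficients; the notation is assumed, since the original figure is not reproduced here):

```latex
L_{OLS}(\beta) = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2
```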
Loss Function for Linear Regression

Ridge regression is a method for analyzing data that suffer from multi-collinearity.
Loss Function for Ridge Regression

Ridge regression adds a penalty (the L2 penalty) to the loss function that is equivalent to the square of the magnitude of the coefficients.
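In the same notation as above, the ridge loss is the least-squares loss plus λ times the sum of squared coefficients:

```latex
L_{ridge}(\beta) = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2
```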
The regularization parameter (λ) regularizes the coefficients such that if the coefficients take large values, the loss function is penalized.
λ → 0: the penalty term has no effect, and the estimates produced by ridge regression equal the least-squares estimates, i.e. the loss function resembles that of the Linear Regression algorithm. Hence, a lower value of λ yields a model close to the Linear Regression model.
λ → ∞: the impact of the shrinkage penalty grows, and the ridge regression coefficient estimates approach zero (the coefficients get close to zero, but never exactly zero).
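This shrinkage is easy to see numerically. The sketch below, a minimal illustration with invented data, uses the closed-form ridge solution (XᵀX + λI)⁻¹Xᵀy in NumPy: at λ = 0 it matches least squares, and as λ grows the coefficients shrink toward (but never reach) zero.

```python
import numpy as np

# Invented toy data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=50)

def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# As lam grows, every coefficient is pulled toward zero.
for lam in [0.0, 1.0, 10.0, 1000.0]:
    print(lam, np.round(ridge_coefficients(X, y, lam), 3))
```

At λ = 0 the output matches the ordinary least-squares fit; at λ = 1000 the coefficients are small but still nonzero, as described above.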
Note: Ridge regression is also known as the L2 Regularization.
To sum up, Ridge regression shrinks the coefficients, which helps to reduce model complexity and multi-collinearity.
LASSO is a regression analysis method that performs both feature selection and regularization in order to enhance the prediction accuracy of the model.
Loss Function for LASSO Regression

LASSO regression adds a penalty (the L1 penalty) to the loss function that is equivalent to the magnitude of the coefficients.
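In the same notation as before, the LASSO loss replaces the squared penalty with the sum of absolute values:

```latex
L_{lasso}(\beta) = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
```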
In LASSO regression, the penalty has the effect of forcing some of the coefficient estimates to be exactly equal to zero when the regularization parameter λ is sufficiently large.
Note: LASSO regression is also known as the L1 Regularization (L1 penalty).
To sum up, LASSO regression sets the coefficients of the less important features to zero, which helps with feature selection, and it shrinks the coefficients of the remaining features to reduce model complexity, hence avoiding overfitting.
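The "exactly zero" behaviour can be sketched with a well-known special case: when the features are orthonormal, the LASSO solution is the soft-thresholded least-squares estimate. The coefficient values below are hypothetical, chosen only to illustrate the thresholding.

```python
import numpy as np

def soft_threshold(beta_ols, lam):
    """LASSO solution for an orthonormal design: shrink each
    least-squares coefficient toward zero by lam, and set it
    exactly to zero once its magnitude is at most lam."""
    return np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)

beta_ols = np.array([3.0, -0.4, 1.2, 0.1])  # hypothetical OLS estimates
print(soft_threshold(beta_ols, 0.5))
```

The small coefficients (−0.4 and 0.1) are driven exactly to zero, while the large ones are merely shrunk by λ; contrast this with ridge, which shrinks but never zeroes.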
Elastic-Net is a regularized regression method that linearly combines the L1 and L2 penalties of the LASSO and Ridge methods respectively.
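Combining the two penalties from the previous sections (with separate parameters λ₁ and λ₂ for the L1 and L2 terms) gives:

```latex
L_{enet}(\beta) = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2
```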
Loss Function for Elastic-Net Regression

A standard least-squares model tends to have some variance in it, i.e. the model won’t generalize well to a data set different from its training data. Regularization significantly reduces the variance of the model without a substantial increase in its bias.
So the regularization parameter λ, used in the techniques described above, controls the trade-off between bias and variance. As the value of λ rises, the coefficients shrink, reducing the variance. This increase in λ is beneficial up to a point, since it only reduces the variance (hence avoiding overfitting) without losing any important properties in the data. But beyond a certain value, the model starts losing important properties, introducing bias into the model and thus underfitting the data. Therefore, the value of λ should be carefully selected.
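One common way to carry out that selection is to compare candidate λ values on held-out data. The sketch below, using the same closed-form ridge fit as earlier and an invented data set with only a few truly informative features, picks the λ with the lowest validation error; in practice a library cross-validation routine would be used instead.

```python
import numpy as np

# Invented data: 10 features, only the first 3 carry signal.
rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.0]
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# Simple train/validation split.
X_tr, X_val = X[:150], X[150:]
y_tr, y_val = y[:150], y[150:]

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

lambdas = [0.0, 0.1, 1.0, 10.0, 100.0, 1000.0]
val_mse = []
for lam in lambdas:
    beta = ridge_fit(X_tr, y_tr, lam)
    val_mse.append(np.mean((X_val @ beta - y_val) ** 2))

best_lam = lambdas[int(np.argmin(val_mse))]
print("validation MSE per lambda:", np.round(val_mse, 3))
print("selected lambda:", best_lam)
```

On this data the largest λ shrinks the coefficients so aggressively that the model underfits, so the validation error rules it out, matching the bias/variance trade-off described above.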
This is all the basics you will need to get started with Regularization. It is a useful technique that can help in improving the accuracy of your regression models.
Translated from: https://medium.com/analytics-vidhya/understanding-regularization-algorithms-450777fa0ed3