Ensemble Classifiers in Machine Learning

    A group of items viewed as a whole rather than individually is called an ensemble. By leveraging this idea in machine learning, we can build models that generalize to future data better than an individual classifier would.

    So, there are four different types of ensemble techniques:

    1. Voting Classifiers

    2. Bagging and Pasting

    3. Boosting

    4. Stacking

    Let’s go over them one by one.

    Voting Classifier

    A voting classifier, as the name implies, is a technique where the results of different classifiers are aggregated and the prediction is made based on the class that gets the most votes. There are two types of voting classifiers:

    a. Hard Voting Classifier: In hard voting, the vote is applied to the predicted class labels, so the class that receives the highest number of votes is predicted. For example, assume that three classifiers classify a training sample:

    · Classifier 1 -> Class A

    · Classifier 2 -> Class B

    · Classifier 3 -> Class A

    Via majority vote, we would classify the sample as “Class A”.

    b. Soft Voting Classifier: In soft voting, the predicted class is the one with the highest class probability averaged over all the individual classifiers. For example, assume there are three classifiers, each outputting probabilities for (Class A, Class B):

    · Classifier 1 -> (0.40, 0.60)

    · Classifier 2 -> (0.47, 0.53)

    · Classifier 3 -> (0.40, 0.60)

    So, the average probability for class A is (0.40 + 0.47 + 0.40)/3 ≈ 0.423 and the average probability for class B is (0.60 + 0.53 + 0.60)/3 ≈ 0.577. Based on the highest averaged probability, the predicted class is B. Let’s see the code:

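    The code embed from the original article is not reproduced here, so below is a minimal sketch of what it might look like, assuming scikit-learn's VotingClassifier with a logistic regression, a decision tree, and an SVM as the base estimators (the choice of base models and hyperparameters is illustrative, not the author's exact code).

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Load the digits dataset and split it into train and test sets
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Three individual classifiers
log_clf = LogisticRegression(max_iter=5000)
tree_clf = DecisionTreeClassifier(random_state=42)
svm_clf = SVC(probability=True, random_state=42)  # probability=True allows soft voting

# Hard-voting ensemble; change voting="hard" to voting="soft" for soft voting
voting_clf = VotingClassifier(
    estimators=[("lr", log_clf), ("dt", tree_clf), ("svc", svm_clf)],
    voting="hard",
)

# Compare each individual classifier with the ensemble
for clf in (log_clf, tree_clf, svm_clf, voting_clf):
    clf.fit(X_train, y_train)
    print(clf.__class__.__name__, accuracy_score(y_test, clf.predict(X_test)))
```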

    In the code above, we use the digits dataset from the scikit-learn datasets library. Hard voting is used here, so if you want to implement soft voting instead, change the voting parameter to "soft" in voting_clf.

    Bagging and Pasting

    Bagging and pasting form another ensemble technique. Here, rather than aggregating the predictions of different models, we aggregate the predictions of a single machine learning algorithm trained on different random subsets of the data.

    Bagging (short for bootstrap aggregating) uses random sampling with replacement, meaning that an instance can appear in the same subset two or more times.

    Pasting, in contrast, uses random sampling without replacement, so an instance cannot appear more than once in the same subset.

    If we compare the results of bagging and pasting, bagging usually gives somewhat better results than pasting, since it introduces a bit more diversity into the subsets each predictor is trained on. Still, it is worth experimenting with both, because the outcome also depends on the data you have.

    Let’s check the code below for Bagging and Pasting.

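    Again, the original code is not shown here; the following is a minimal sketch, assuming scikit-learn's BaggingClassifier wrapped around a decision tree (the base estimator and hyperparameters are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# bootstrap=True  -> bagging (sampling with replacement)
# bootstrap=False -> pasting (sampling without replacement)
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=100,
    max_samples=0.8,
    bootstrap=True,
    random_state=42,
)
bag_clf.fit(X_train, y_train)
print("Bagging accuracy:", accuracy_score(y_test, bag_clf.predict(X_test)))
```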

    To switch the classifier between bagging and pasting, we just change the value of the bootstrap parameter to True or False, which decides whether sampling is done with or without replacement.

    Boosting

    Boosting is another ensemble technique that combines several weak classifiers into a strong classifier. The main idea behind boosting is to train predictors sequentially, each one trying to correct the mistakes of its predecessor. There are two main types of boosting algorithms:

    a. AdaBoost: The AdaBoost algorithm works as follows:

    · Train a base classifier and make predictions on the training set

    · Compare the predictions with the true labels and, based on that, increase the weights of the misclassified instances

    · Train a new classifier with the updated weights, check the predictions again, update the weights of the misclassified instances, and so on.

    b. Gradient Boosting: The Gradient Boosting algorithm is similar to AdaBoost; the difference is that in gradient boosting, each new predictor tries to fit the residual errors made by its predecessor instead of tweaking the instance weights.

    Check out the implementation of AdaBoost and Gradient Boosting below:

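    As before, the original code is not reproduced here; this is a minimal sketch using scikit-learn's AdaBoostClassifier and GradientBoostingClassifier (the weak learner and hyperparameters are illustrative assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# AdaBoost: each new weak learner focuses on the instances its predecessors misclassified
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # decision stumps as weak learners
    n_estimators=200,
    learning_rate=0.5,
    random_state=42,
)

# Gradient boosting: each new tree fits the residual errors of the previous ones
gb_clf = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1, random_state=42)

for clf in (ada_clf, gb_clf):
    clf.fit(X_train, y_train)
    print(clf.__class__.__name__, accuracy_score(y_test, clf.predict(X_test)))
```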

    There is also one more boosting algorithm known as Extreme Gradient Boosting, or XGBoost. It is quite famous and is extremely fast, scalable, and portable. Please check the link to learn more about XGBoost (https://xgboost.readthedocs.io/en/latest/#).

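    As a quick illustration (assuming the xgboost package is installed; this example is not from the original article), basic usage might look like this:

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier  # requires: pip install xgboost

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Gradient-boosted trees; XGBoost picks a multi-class objective automatically
xgb_clf = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=3)
xgb_clf.fit(X_train, y_train)
print("XGBoost accuracy:", accuracy_score(y_test, xgb_clf.predict(X_test)))
```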

    Stacking

    Stacking is the last of these ensemble techniques. It is based on the simple idea of training a model to perform the aggregation, instead of using a function such as hard voting to aggregate the predictions of all the classifiers in an ensemble.

    The aggregating model in stacking is also called a blender. The approach is to first split the dataset into a training set and a holdout set. The first subset is used to train the predictors in the first layer. The first layer’s predictors are then used to make predictions on the second (holdout) set to check their accuracy. Once that is done, those first-layer predictions become the input for training the blender.

    It is also possible to train several different blenders in order to maximize the accuracy of the overall model.

    Let’s check the code below.

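    The code embed is missing here as well; below is a minimal sketch, assuming scikit-learn's StackingClassifier with a decision tree and an SVM in the first layer and a logistic regression as the blender (these particular models are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# First-layer predictors; final_estimator is the blender that aggregates their predictions
stack_clf = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=5000),
    cv=5,  # out-of-fold predictions are used to train the blender
)
stack_clf.fit(X_train, y_train)
print("Stacking accuracy:", accuracy_score(y_test, stack_clf.predict(X_test)))
```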

    The implementation of a stacking classifier is super easy in scikit-learn. But if you are interested in building a stacking classifier from scratch, I suggest you check out this link: https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/

    This article is inspired by Aurélien Géron’s book Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. I have tried to simplify as much as possible so that it can help others. Please get the book from here: http://shop.oreilly.com/product/0636920052289.do

    Translated from: https://medium.com/the-innovation/ensemble-classifiers-in-machine-learning-dba7593a78fd
