CNN (Convolutional Neural Network)
DEEP LEARNING
Why CNN?
In fully connected neural networks, every neuron in a given layer is connected to all the neurons in the previous layer. This results in a very large number of parameters, which makes such networks more prone to overfitting. Because there is such a long chain of neurons, the model can also suffer from the vanishing gradient problem. What we aim for are better optimization algorithms, better activation functions, better initialization methods, and better regularization.

If we train DNNs on images with larger dimensions, we get a huge number of parameters, which requires a lot of computation power. Ideally, we would like to have DNNs that are complex (many non-linearities) but have fewer parameters and are hence less prone to overfitting.
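To make the parameter blow-up concrete, here is a rough back-of-the-envelope comparison; the image size and layer widths are illustrative assumptions, not figures from the article:

```python
# Rough parameter count: one fully connected hidden layer on a 224x224 RGB image
# (illustrative numbers)
inputs = 224 * 224 * 3                 # 150,528 input values
hidden = 1000                          # a single hidden layer of 1000 neurons
fc_params = inputs * hidden + hidden   # weights + biases, roughly 150.5 million

# The same image processed by 64 convolutional filters of size 3x3x3
conv_params = 64 * (3 * 3 * 3) + 64    # weights + biases = 1,792

print(f"fully connected: {fc_params:,} parameters")
print(f"convolutional:   {conv_params:,} parameters")
```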
How an Image Looks
Figure: the three color channels of an RGB image (https://media.geeksforgeeks.org/wp-content/uploads/RGB-1.jpg)
The figure above represents the layers of a colored image. A colored image consists of three color channels: red, green, and blue. This is how an image actually looks internally; every pixel of a colored image is a combination of values from these three channels.
Convolution Operation vs Neural Network
In a CNN, instead of taking the weighted sum of all the inputs to produce one output neuron, we take the weighted sum of only a few inputs. The box inside the input image in the first part of the figure is a set of weights, known in CNNs as a filter; it is multiplied element-wise with the pixels in that particular portion of the image, and the result becomes one pixel of the next layer. The filter then moves step by step over the entire image to compute the full output. We can have as many filters as we want in one layer; the depth of the output equals the number of filters applied to the input image.
As you can clearly see in the image, applying such weights, or filters, changes the image completely. In both examples an edge detection filter has been applied, so only the parts of the image where edges are present are highlighted and the rest is completely black.
In the animation you can clearly see how the filter moves over the entire image. Note that here the input is 3D and the filter is also 3D, but the convolution operation we perform is 2D: we slide only vertically and horizontally, not along the depth, because the depth of the filter is the same as the depth of the input.

Each filter applied to a 3D input gives a 2D output, and stacking the outputs of multiple such filters results in a 3D output.
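A minimal NumPy sketch of this operation may help; the array sizes and filter values here are illustrative assumptions:

```python
import numpy as np

def conv2d(image, filters, stride=1):
    """Naive convolution of a 3D input (H, W, D) with K filters of shape (F, F, D).

    Each filter slides only vertically and horizontally; its depth matches the
    input depth, so each filter yields one 2D map, and stacking the K maps
    gives a 3D output of shape (H_out, W_out, K).
    """
    H, W, D = image.shape
    K, F, _, _ = filters.shape          # filters: (K, F, F, D)
    H_out = (H - F) // stride + 1
    W_out = (W - F) // stride + 1
    out = np.zeros((H_out, W_out, K))
    for k in range(K):
        for i in range(H_out):
            for j in range(W_out):
                patch = image[i*stride:i*stride+F, j*stride:j*stride+F, :]
                out[i, j, k] = np.sum(patch * filters[k])
    return out

image = np.random.rand(7, 7, 3)         # 7x7 RGB input (illustrative)
filters = np.random.rand(4, 3, 3, 3)    # four 3x3x3 filters
print(conv2d(image, filters).shape)     # (5, 5, 4): output depth = number of filters
```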
Some Terminologies
a. Input width (WI), height (HI) and depth (DI)
b. Output width (WO), height (HO) and depth (DO)
c. The spatial extent of a filter (F): a single number denoting both width and height, as they are equal
d. The filter depth, which is always the same as the input depth (DI)
e. The number of filters (K)
f. Padding (P) and stride (S)
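The output dimensions follow directly from these quantities via the standard formula, sketched here as a small Python helper:

```python
def output_size(w_in, f, p=0, s=1):
    """Standard convolution output size: floor((W - F + 2P) / S) + 1."""
    return (w_in - f + 2 * p) // s + 1

# A 7x7 input with a 3x3 filter, no padding, stride 1 -> 5x5 output
print(output_size(7, 3))          # 5
# The same input with padding 1 keeps the spatial size: 7x7
print(output_size(7, 3, p=1))     # 7
# Stride 2 shrinks the output further: 3x3
print(output_size(7, 3, s=2))     # 3
```

The output depth DO simply equals the number of filters K.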
Padding:
As shown in the image, if we use a 3x3 filter on a 7x7 input we get a 5x5 output, since the kernel is not allowed to move outside the input region; every time we perform the operation we lose some information at the borders of the image. The solution is padding: we add an extra border of pixels around the input so the filter can slide over the edges as well. The output then has the same size as the input, which is how we preserve that information.
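A quick NumPy illustration of zero padding, with illustrative values:

```python
import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(7, 7)            # 7x7 single-channel input
kernel = np.ones((3, 3)) / 9.0          # a simple 3x3 averaging filter

padded = np.pad(image, pad_width=1)     # add a zero border: 7x7 -> 9x9
out = convolve2d(padded, kernel, mode="valid")
print(out.shape)                        # (7, 7): same size as the input
```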
Stride:
Stride defines the interval at which the filter is applied: the higher the stride, the smaller the output. In other words, the movement of the filter over the input is defined by the stride. If the stride is 1, we move one step horizontally and vertically over the image; similarly, if the stride is 2, we move 2 steps horizontally and vertically.
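Using the output-size formula from above, the effect of the stride is easy to see (illustrative numbers):

```python
# Output size for a 7x7 input and 3x3 filter, no padding: floor((W - F) / S) + 1
for stride in (1, 2, 3):
    size = (7 - 3) // stride + 1
    print(f"stride {stride}: {size}x{size} output")
# stride 1: 5x5, stride 2: 3x3, stride 3: 2x2
```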
Max Pooling
Max pooling operation
We are familiar with almost all the layers in this architecture except the max pooling layer. Here, by passing the filter over an image (with or without padding), we get a transformed matrix of values. Now we perform max pooling over the convolved input to select the maximum value from each position of the kernel, as specified by the stride length. Here we select a stride length of 2 and a 2x2 filter, meaning the 4x4 convolved output is split into 4 quadrants. The maximum value of each of these quadrants is taken, and a 2x2 matrix is generated.
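A minimal NumPy sketch of this 2x2, stride-2 max pooling; the input values are made up for illustration:

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 8, 4, 6]])

# Reshape the 4x4 input into four 2x2 quadrants and take the max of each
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 4]
#  [8 9]]

# Average pooling is the same operation with mean instead of max
print(x.reshape(2, 2, 2, 2).mean(axis=(1, 3)))
```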
Max pooling is done to select the most prominent, or salient, value within a neighborhood. It is also known as subsampling, since we sample just a single value from a region.
Similar to max pooling, average pooling is also used sometimes; it is carried out by taking the average value in the sampled neighborhood.
The idea behind max pooling is to condense the convolutional input into a smaller size, thereby making it easier to manage.
Full Convolutional Neural Network
LeNet architecture
The following diagram illustrates the configuration and working of a convolutional neural network. It follows the LeNet architecture, created by Yann LeCun. For a change, the input image is 32x32x1 pixels; there is no color depth because the images are black and white. At the end it consists of 2 fully connected layers; adding fully connected layers is usually a cheap way of learning non-linear combinations of the high-level features represented by the output of the convolutional layers.

Fully connected layer 1: number of neurons: 120; input is h4 flattened, i.e. 5x5x16 = 400; number of parameters in h5 = 120x400 + 120 biases = 48,120 parameters.

Fully connected layer 2: number of neurons: 84; input is the number of neurons in h5 = 120; number of parameters in h6 = 84x120 + 84 biases = 10,164 parameters.
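A minimal PyTorch sketch of this LeNet-style stack; the layer sizes follow the description above, while the activation and pooling choices are assumptions:

```python
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32x1 -> 28x28x6
            nn.Tanh(),
            nn.AvgPool2d(2),                  # -> 14x14x6
            nn.Conv2d(6, 16, kernel_size=5),  # -> 10x10x16
            nn.Tanh(),
            nn.AvgPool2d(2),                  # -> 5x5x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                     # 5x5x16 = 400
            nn.Linear(400, 120),              # 120x400 + 120 = 48,120 params
            nn.Tanh(),
            nn.Linear(120, 84),               # 84x120 + 84 = 10,164 params
            nn.Tanh(),
            nn.Linear(84, 10),                # 10 output classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNet()
out = model(torch.randn(1, 1, 32, 32))
print(out.shape)                                   # torch.Size([1, 10])
print(sum(p.numel() for p in model.parameters()))  # total trainable parameters
```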
Note: Overall, this combination of convolutional and fully connected layers is much more efficient than an entirely fully connected network. It has a significantly lower number of parameters but is still able to estimate functions of very high complexity.
Train a Convolutional Neural Network
A CNN can be implemented as a feedforward network in which only a few weights (colored) are active; the rest of the weights (grey) remain zero. Thus, we can train a CNN using backpropagation by thinking of it as a feedforward neural network with sparse connections.
Note: In practice, however, we don't do this, as most of the weights in such a matrix end up being zero. Frameworks like PyTorch and TensorFlow don't create such large matrices and focus only on the weights that actually need to be updated.
References:
https://www.guvi.in/ (AI with Deep Learning certification course)
https://www.deeplearning.ai/deep-learning-specialization/
Hands-On Machine Learning with Scikit-Learn (reference book, PDF: https://drive.google.com/file/d/16DdwF4KIGi47ky7Q_B-4aApvMYW2evJZ/view?usp=sharing)
Translated from: https://medium.com/analytics-vidhya/cnn-convolution-neural-network-17cc89802234