In this blog, I will outline how to build a reliable image classification model using a convolutional neural network to detect the presence of pneumonia from chest X-ray images.
Pneumonia is a common infection that inflames the air sacs in the lungs, causing symptoms such as difficulty breathing and fever. Even though pneumonia is not difficult to treat, a timely diagnosis is crucial. Without proper treatment, pneumonia can become fatal, especially among children and the elderly. A chest X-ray is an affordable method for diagnosing pneumonia. A model that can reliably classify pneumonia from X-ray images could ease the load on physicians in areas where demand is high.
Kermany and his colleagues at UCSD took the initiative to identify diseases based on the chest X-rays and Optical Coherence Tomography scans using deep learning. We used chest X-ray images provided in their study as our dataset.
A data folder should be structured as below.
```
DATA
│
├── train
│   ├── NORMAL
│   └── PNEUMONIA
│
├── test
│   ├── NORMAL
│   └── PNEUMONIA
│
└── validation
    ├── NORMAL
    └── PNEUMONIA
```

After removing image files without proper encodings, we had 5,639 files in our data set. We used 15% of these images as our validation set and another 15% as a testing set. Our final training set included 1,076 normal cases and 2,873 cases of pneumonia.
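The post does not show how the 70/15/15 split was made, so the following is only a minimal sketch of one way to do it. The function name split_files, the seed, and the file names are assumptions for illustration.

```python
import random

def split_files(files, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle file names and split them into train, validation and test lists."""
    files = sorted(files)               # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(files)
    n_val = round(len(files) * val_frac)
    n_test = round(len(files) * test_frac)
    val = files[:n_val]
    test = files[n_val:n_val + n_test]
    train = files[n_val + n_test:]
    return train, val, test

# hypothetical file names; splitting each class separately keeps the class ratio
normal_files = [f"NORMAL_{i}.jpeg" for i in range(100)]
train, val, test = split_files(normal_files)
print(len(train), len(val), len(test))
```

Running the split once per class folder (NORMAL, then PNEUMONIA) keeps the proportion of pneumonia cases roughly equal across the three sets.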
If you are interested in the steps to run exploratory data analysis on image data, please see my previous post.
Our exploratory data visualizations showed that inflammation in the lungs often obstructs the visibility of the heart and the ribcage, creating greater variability around the lung area.
As our baseline model, we will build a simple convolutional neural network that takes in images after resizing them to a square matrix and normalizing all pixel values to range from 0 to 1. The full code is shown below.
```python
from tensorflow.keras.preprocessing import image
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.callbacks import EarlyStopping

# assumed paths, following the DATA folder structure; adjust to your setup
train_dir = 'DATA/train'
val_dir = 'DATA/validation'

# initiating generators that rescale and resize the images in a directory
train_g = image.ImageDataGenerator(rescale=1/255).flow_from_directory(
    train_dir, target_size=(256, 256), color_mode='grayscale', class_mode='binary')
val_g = image.ImageDataGenerator(rescale=1/255).flow_from_directory(
    val_dir, target_size=(256, 256), color_mode='grayscale', class_mode='binary')

# setting up the architecture
model = models.Sequential()
model.add(layers.Conv2D(filters=32, kernel_size=3, activation='relu',
                        padding='same', input_shape=(256, 256, 1)))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# compiling the model (learning_rate replaces the deprecated lr argument)
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['accuracy', 'Recall'])

# setting up an early stopping callback to avoid overfitting:
# stop if the validation loss has not improved for 5 epochs
cp = EarlyStopping(patience=5, restore_best_weights=True)

# fitting the model
history = model.fit(train_g,               # fit on the train generator
                    epochs=100,            # early stopping ends it before 100 epochs
                    validation_data=val_g, # use this generator as the validation set
                    callbacks=[cp],        # use cp as callback
                    verbose=2)             # report each epoch without progress bar

# evaluating the model
model.evaluate(val_g)  # evaluate the best weights on the validation set
```

Now I will explain each step in detail.
ImageDataGenerator() from tensorflow.keras.preprocessing.image takes images and creates augmented data based on its parameters. Here we only ask it to rescale all pixel values to the range 0 to 1, without specifying any augmentation parameters. Combined with flow_from_directory, the generator reads images from the directory in the assigned format (256*256 grayscale images with binary labels), then yields the rescaled data in batches.
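As a rough illustration of what rescale = 1/255 does to each image, here is a plain-Python sketch. The helper name rescale and the pixel values are made up for the example; the generator applies the same multiplication internally to every image it loads.

```python
def rescale(pixels, factor=1 / 255):
    """Multiply every pixel value by `factor`."""
    return [[value * factor for value in row] for row in pixels]

# a made-up 2x3 grayscale patch with 8-bit intensities (0 to 255)
patch = [[0, 51, 102],
         [153, 204, 255]]
scaled = rescale(patch)
for row in scaled:
    print([round(value, 2) for value in row])
```

After rescaling, the darkest possible pixel maps to 0 and the brightest to 1, which keeps the inputs in a range that is easier for gradient-based training to handle.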
keras.models.Sequential() initiates a sequential model. This model will sequentially process the added layers.
Conv2D layers are the convolutional layers, which take the input and run it through the assigned number of filters. Kernel size refers to the dimensions of each filter. So in this example, each 3*3 window of pixels in our 256*256*1 image (1 refers to the number of channels: RGB images have 3 channels, while grayscale images have 1) is run through 32 filters, generating 32 feature maps of 256*256 each, for an output of size 256*256*32.
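padding = 'same' pads the border of the image with zeros so that the 3*3 window can also be centered on the edge pixels. With the default stride of 1, this keeps each feature map at the same 256*256 size as the input; without padding, the output would shrink to 254*254.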
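To make the effect of 'same' padding concrete, here is a minimal plain-Python sketch of a single-filter, stride-1 convolution with zero padding. The function name conv2d_same, the 4*4 input, and the filter of ones are assumptions for illustration. Like Keras, it slides the filter without flipping it.

```python
def conv2d_same(image, kernel):
    """Slide a square, odd-sized kernel over `image` with stride 1 and zero padding."""
    height, width = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2  # one ring of zeros for a 3x3 kernel
    # build a zero-padded copy of the input
    padded = [[0.0] * (width + 2 * pad) for _ in range(height + 2 * pad)]
    for r in range(height):
        for c in range(width):
            padded[r + pad][c + pad] = image[r][c]
    # each output pixel is the weighted sum of the k x k window centred on it
    output = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            output[r][c] = sum(padded[r + i][c + j] * kernel[i][j]
                               for i in range(k) for j in range(k))
    return output

image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
ones_filter = [[1.0] * 3 for _ in range(3)]  # hypothetical 3x3 filter
feature_map = conv2d_same(image, ones_filter)
print(len(feature_map), len(feature_map[0]))  # height and width match the input
```

A real Conv2D layer learns 32 such filters and adds a bias before the activation, but the shape behaviour is the same: one output value per input pixel.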
activation = 'relu' means that we are assigning the rectified linear unit as our activation function. Simply put, we are telling the layer to convert all our negative values to 0.
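The rectified linear unit is simple enough to write out by hand. This sketch (the function name relu and the sample values are illustrative) shows negative values being clipped to 0 while positive values pass through unchanged.

```python
def relu(values):
    """Replace every negative value with 0 and keep the rest unchanged."""
    return [max(0.0, v) for v in values]

print(relu([-2.0, -0.5, 0.0, 1.5]))
```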
We then feed the outputs of the convolutional layer into the pooling layer. The MaxPooling2D layer abstracts the convolved output by keeping only the maximum value of each 2*2 block. Now we have 32 feature maps of 128*128 each, that is, an output of size 128*128*32.
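The pooling step can be sketched in a few lines of plain Python. The function name max_pool_2x2 and the 4*4 feature map below are made up for the example; the point is that each non-overlapping 2*2 block collapses to its largest value, halving the height and width.

```python
def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(feature_map[r][c], feature_map[r][c + 1],
                           feature_map[r + 1][c], feature_map[r + 1][c + 1]))
        pooled.append(row)
    return pooled

# a made-up 4x4 feature map
fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 9, 8],
        [7, 6, 3, 2]]
print(max_pool_2x2(fmap))
```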
Now we need to narrow down these 4-dimensional outputs (batch, height, width, and channels) to a single number per image that tells us whether to classify it as pneumonia or normal. We do this by first flattening the output for each image into a single dimension, then running it through successively smaller dense layers. A sigmoid function is applied as the activation on the final layer because we want the model to output the probability that the image shows pneumonia.
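The arithmetic behind the flatten and dense layers of the baseline model can be checked by hand. The sketch below counts the values and weights involved and shows the sigmoid squashing a raw score into a probability. The variable names are illustrative, and the counts assume the 128*128*32 pooled output described earlier.

```python
import math

def sigmoid(z):
    """Map any real-valued score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

flattened = 128 * 128 * 32            # values per image after Flatten
dense_params = flattened * 128 + 128  # weights plus biases of Dense(128)
output_params = 128 * 1 + 1           # weights plus bias of Dense(1)

print(flattened, dense_params, output_params)
print(sigmoid(-4.0), sigmoid(0.0), sigmoid(4.0))
```

Almost all of the baseline model's weights sit in the first dense layer, which is one reason to add more pooling steps before flattening in a deeper model.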
We have defined the architecture of our model. The next step is to decide the goal of this model and how we want it to get there. Using model.compile, we tell the model to minimize the binary cross-entropy loss (the log loss; think logistic regression) using gradient descent. Here we use the RMSprop algorithm, which adapts the step size for each weight by dividing the learning rate by a running average of recent gradient magnitudes. In the later model, I used AMSGrad, a variant of the Adam optimizer, which performed better for our problem.
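For readers who want to see what the loss measures, here is a minimal sketch of binary cross-entropy in plain Python. The function name, the clipping constant eps, and the example labels and probabilities are assumptions for illustration, not values from the model.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Average log loss over labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]                  # made-up labels, 1 = pneumonia
confident = [0.95, 0.05, 0.90, 0.10]   # made-up predictions close to the labels
hesitant = [0.60, 0.40, 0.55, 0.45]    # made-up predictions near 0.5
print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, hesitant))
```

Confident correct predictions give a loss near 0, while predictions near 0.5 give a loss near log(2), about 0.69, so a lower loss means better-calibrated and more accurate probabilities.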
Finally, we have finished constructing our model. It's time to fit our training data! By default, the generator feeds the model batches of 32 images, so each epoch runs through about 124 batches for our 3,949 training images. We set EarlyStopping to prevent overfitting: training stops if the validation loss has not improved for 5 consecutive epochs. I set restore_best_weights to true so the model reverts to the weights from the epoch with the lowest validation loss. After each epoch, the model is evaluated on the validation generator we created earlier.
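The batch count and the stopping rule can both be sketched without Keras. The code below is a simplified model of the behaviour described above, not the library's implementation. The function names and the list of validation losses are hypothetical.

```python
import math

def batches_per_epoch(n_images, batch_size=32):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)

def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop; weights from best_epoch are restored
    return len(val_losses) - 1, best_epoch

print(batches_per_epoch(1076 + 2873))
# hypothetical validation losses: the best value comes at epoch 2
print(early_stopping_epoch([0.5, 0.4, 0.3, 0.35, 0.36, 0.31, 0.32, 0.33, 0.2]))
```

In the hypothetical run, training stops before the final epoch is ever reached, which shows the trade-off of a small patience value: a later improvement is never seen.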
Our first model predicted the class of the validation data with 94% accuracy and a log loss of 0.11. Based on the graph below, we can see that the training loss has room to improve, so we can probably increase the complexity of the model. Also, the validation loss seems to hover around 0.1. We can try to improve generalizability by using data augmentation, which imitates adding more data.
Here is the full code to plot the loss and accuracy graphs from the fitted model.
```python
import matplotlib.pyplot as plt
# Jupyter notebook magic; remove this line when running as a plain script
%matplotlib inline

def plot_performance(hist):
    '''
    takes the History object returned by model.fit as input
    plots accuracy and loss
    '''
    hist_ = hist.history
    epochs = hist.epoch

    plt.plot(epochs, hist_['accuracy'], label='Training Accuracy')
    plt.plot(epochs, hist_['val_accuracy'], label='Validation Accuracy')
    plt.title('Training and validation accuracy')
    plt.legend()

    plt.figure()
    plt.plot(epochs, hist_['loss'], label='Training loss')
    plt.plot(epochs, hist_['val_loss'], label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()

    plt.show()
```

Now, we will try to implement data augmentation and add more complexity to our model.
```python
# redefining the training generator
data_aug_train = image.ImageDataGenerator(
    rescale=1/255,
    # allow rotation within 15 degrees
    rotation_range=15,
    # adjust range of brightness (1 = same)
    brightness_range=[0.9, 1.1],
    # allow shear by up to 5 degrees
    shear_range=5,
    # zoom range of [0.8, 1.2]
    zoom_range=0.2)

# attach the generator to the directory
train_g2 = data_aug_train.flow_from_directory(
    train_dir, target_size=(256, 256), color_mode='grayscale', class_mode='binary')

# define architecture
model = models.Sequential()
model.add(layers.Conv2D(32, 3, activation='relu', padding='same',
                        input_shape=(256, 256, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, 3, activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, 3, activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(256, 3, activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(512, 3, activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(2048, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# configure
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(amsgrad=True),
              metrics=['accuracy'])

# train
history = model.fit(train_g2,
                    epochs=100,  # it won't run all 100
                    validation_data=val_g,
                    callbacks=[cp],
                    verbose=2)

# evaluate
model.evaluate(val_g)
```

This time we added some parameters to our training image data generator. Now the generator creates new images for each batch by applying a random rotation, brightness change, shear, and zoom, each within the assigned range, to the original images.
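To give a feel for how much each image varies between batches, the sketch below draws one random set of transformation parameters from the ranges configured above. It only samples the numbers and does not transform any image. The function name and dictionary keys are hypothetical, and uniform sampling around the identity transform is my understanding of how the generator picks its values.

```python
import random

def sample_augmentation(rng):
    """Draw one random set of transformation parameters within the configured ranges."""
    return {
        "rotation_deg": rng.uniform(-15, 15),  # rotation_range = 15
        "brightness": rng.uniform(0.9, 1.1),   # brightness_range = [0.9, 1.1]
        "shear_deg": rng.uniform(-5, 5),       # shear_range = 5
        "zoom": rng.uniform(0.8, 1.2),         # zoom_range = 0.2
    }

rng = random.Random(0)  # fixed seed so the example is reproducible
for _ in range(3):
    print(sample_augmentation(rng))
```

Because new parameters are drawn every time an image is loaded, the model rarely sees exactly the same picture twice, which is what makes augmentation behave like extra data.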
We also increased the model complexity by adding four more sets of convolutional and pooling layers (five in total) and widening the dense layer from 128 to 2,048 units. It's recommended to increase the number of convolution filters as the layers progress. This is because as we move through the layers, we are trying to abstract more information, thus requiring larger sets of filters. This is analogous to how our brain processes visual information. As the signal moves from our retina to the optic chiasm, to the thalamus, to the primary visual cortex, and then through the inferior temporal cortex, the receptive fields of neurons at each step grow larger and become more and more sensitive to complex information.
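It can help to trace how the output shape changes through the deeper model. This sketch assumes, as in the code above, that every convolution uses 'same' padding (so it keeps height and width) and every pooling layer halves them. The function name trace_shapes is illustrative.

```python
def trace_shapes(input_size=256, filters=(32, 64, 128, 256, 512)):
    """Return the (height, width, channels) shape after each conv + pool block."""
    shapes = []
    size = input_size
    for n_filters in filters:
        size //= 2  # 2x2 max pooling halves height and width
        shapes.append((size, size, n_filters))
    return shapes

shapes = trace_shapes()
for block, shape in enumerate(shapes, start=1):
    print(f"after block {block}: {shape}")

height, width, channels = shapes[-1]
flattened = height * width * channels
print("flattened length:", flattened)
print("Dense(2048) parameters:", flattened * 2048 + 2048)
```

The spatial size shrinks while the number of feature maps grows, so each later filter summarizes a larger region of the original X-ray.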
Our second model showed 97.3% accuracy with a log loss of 0.075 on the validation set. It seems that our adjustments did improve the model! Let's test it on our testing set to make sure it generalizes well to unseen data.
Our model predicted the class of the X-ray images in the test set with 97.8% accuracy. It successfully identified 97.9% of the pneumonia cases.
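The two figures above are accuracy and recall. As a sketch of how they are computed from predictions, the following uses a made-up set of eight labels rather than the actual test results. The function name is an assumption.

```python
def accuracy_and_recall(y_true, y_pred):
    """Compute accuracy and recall for binary labels where 1 = pneumonia."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # recall: share of true pneumonia cases that the model caught
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall

# made-up labels and predictions, not the real test set
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
print(accuracy_and_recall(y_true, y_pred))
```

For a screening task, recall is the number to watch: a false negative is a pneumonia case sent home undiagnosed.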
Using a convolutional neural network, our model was able to correctly detect close to 98% of the pneumonia cases in our dataset. But especially for medical problems where lives are at stake, even 2% of missed cases should not be dismissed. To understand the limitations of our model, a detailed investigation into where the model fails is an important next step.
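One possible starting point for that investigation is to list the pneumonia images the model missed, ordered by how confidently it got them wrong. This is only a sketch: the function name, the 0.5 threshold, and the file names and probabilities are hypothetical.

```python
def missed_pneumonia(filenames, y_true, y_prob, threshold=0.5):
    """Return (filename, probability) for pneumonia cases predicted as normal."""
    missed = [(name, prob)
              for name, label, prob in zip(filenames, y_true, y_prob)
              if label == 1 and prob < threshold]
    # most confidently wrong predictions first
    return sorted(missed, key=lambda item: item[1])

# hypothetical file names, labels (1 = pneumonia) and predicted probabilities
names = ["a.jpeg", "b.jpeg", "c.jpeg", "d.jpeg"]
labels = [1, 1, 0, 1]
probs = [0.90, 0.20, 0.10, 0.45]
for name, prob in missed_pneumonia(names, labels, probs):
    print(name, prob)
```

Viewing the listed images side by side may reveal shared traits, such as image quality or patient positioning, that the model handles poorly.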
Translated from: https://towardsdatascience.com/detecting-pneumonia-using-convolutional-neural-network-599aea2d3fc9