Take the First Character in the Code

Technology 2022-07-12

In this article, I'll illustrate how to approach a core computer vision problem known as semantic segmentation. Simply put, the goal of semantic segmentation is to assign each pixel in a given image to a class according to what the image shows.
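
To make "classify each pixel" concrete, here is a minimal sketch using NumPy only. The random scores stand in for a model's output and the tiny 4x4 size is just for illustration; both are assumptions, not part of the original pipeline.

```python
import numpy as np

# Stand-in for a model's output: one score per class for every pixel,
# laid out as (classes, height, width). Two classes: 0 = background, 1 = water.
rng = np.random.default_rng(0)
scores = rng.random((2, 4, 4))

# Semantic segmentation picks, for each pixel, the class with the highest score.
mask = scores.argmax(axis=0)

print(mask.shape)  # one label per pixel
print(mask)
```

The result has the same height and width as the input, with one class index per pixel.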

LNDST is a classic example of a semantic segmentation problem that can be solved using CNNs. The Landsat dataset consists of 400x400 RGB satellite images taken by the Landsat 8 satellite. Each image can contain water and background. Our classifier should predict each pixel as 0 (background) or 1 (water). The ranking metric is the F1/Dice score.
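
For reference, the Dice score of two binary masks is twice their overlap divided by the sum of their sizes, which for binary labels is the same as F1. Below is a small NumPy sketch. How the competition aggregates the score (per image or over all pixels) is not stated here, and returning 1.0 for two empty masks is a convention I'm assuming.

```python
import numpy as np

def dice_score(pred, target):
    """Dice / F1 score for binary masks (1 = water, 0 = background)."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(dice_score(pred, target))  # 2 * 1 / (2 + 1)
```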

We’ll be using FastAI v1 to approach this problem. FastAI is a popular wrapper that works in tandem with the PyTorch framework. I chose FastAI for solving this problem since it provides several features like learning rate finder, data loaders which can be created with a couple of lines of code, and several other goodies. Make sure you have downloaded the dataset and extracted it to a folder named data. Let’s start!

from fastai.vision import *

The above line imports FastAI’s vision module.

path = Path('data')
path_img = path/'train_images'
path_lbl = path/'train_gt'
img_names = get_image_files(path_img)
lbl_names = get_image_files(path_lbl)

img_names and lbl_names are lists containing the training images and their respective masks.
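
Since the pipeline pairs each image with the mask that shares its base name, it can be worth checking that no image is missing its mask before training. The sketch below uses only pathlib. The file names (image_1.jpg and so on) and both helper functions are hypothetical, and it uses the file stem, which matches the split on '.' used in this article as long as names contain a single dot.

```python
from pathlib import Path

def mask_path_for(image_path, mask_dir):
    """Return the mask path that shares the image's file stem, with a .png suffix."""
    return Path(mask_dir) / (Path(image_path).stem + '.png')

def find_unpaired(image_paths, mask_paths):
    """List the images whose expected mask name is not among mask_paths."""
    mask_names = {Path(m).name for m in mask_paths}
    return [p for p in image_paths if mask_path_for(p, '.').name not in mask_names]

# Hypothetical file names, for illustration only.
images = [Path('data/train_images/image_1.jpg'), Path('data/train_images/image_2.jpg')]
masks = [Path('data/train_gt/image_1.png')]

print(mask_path_for(images[0], 'data/train_gt'))
print(find_unpaired(images, masks))  # images with no matching mask
```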

# Batch size
bs = 8

# Labels
labels = ['background', 'water']

# Mapping function from image file names (x) to mask file names (y)
def get_y_fn(x):
    dest = x.name.split('.')[0] + '.png'
    return path_lbl/dest

src = (SegmentationItemList.from_folder(path_img)  # Load in x data from folder
       .split_by_rand_pct()  # Split data into training and validation set
       .label_from_func(get_y_fn, classes=labels))  # Label data using the get_y_fn function

# Define our image augmentations
tfms = get_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.)

data = (src.transform(tfms, size=400, tfm_y=True)  # Augments the images and the mask
        .databunch(bs=bs)  # Create a databunch
        .normalize(imagenet_stats))  # Normalize for imagenet mean and std

The above code creates an ImageDataBunch object which deals with all the aspects of handling data like preprocessing, augmentations, splitting into training and validation sets, and so on. Let us now take a look at a mini-batch of our data.
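
Two details of this pipeline are easy to miss: geometric augmentations must be applied identically to the image and its mask (that is what tfm_y=True asks for), while normalization touches only the image. Here is a rough NumPy sketch of both ideas. It uses height x width x channel arrays for readability, whereas FastAI works with channel-first tensors, and the helper names are my own.

```python
import numpy as np

# ImageNet per-channel mean and standard deviation (RGB).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def flip_pair(img, mask):
    """Flip image and mask horizontally together so pixels stay aligned."""
    return img[:, ::-1], mask[:, ::-1]

def normalize(img):
    """Normalize an HxWx3 float image with values in [0, 1]. The mask is left alone."""
    return (img - IMAGENET_MEAN) / IMAGENET_STD

def denormalize(img):
    """Undo normalize, e.g. for displaying the image."""
    return img * IMAGENET_STD + IMAGENET_MEAN

rng = np.random.default_rng(0)
img = rng.random((400, 400, 3))
mask = (img[..., 2] > 0.5).astype(np.uint8)  # toy mask derived from the blue channel

img_f, mask_f = flip_pair(img, mask)
x = normalize(img_f)
print(x.shape, mask_f.shape)
```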

data.show_batch(8, figsize=(10,10))

This is a random mini-batch after applying random transformations like rotation, flipping, etc.

Now that our data is ready, let us create a model and train it. There are several architectures that can be used to solve a segmentation task like U-Net, FPN, DeepLabV3, PSPNet. We’ll be using U-Net in this article.

# Pretrained encoder
encoder =   # (the encoder architecture is missing from the source)
learn = unet_learner(data, encoder, metrics=dice)

FastAI's unet_learner function creates an instance of the Learner class. The Learner class handles the complete training loop and prints the specified metrics. Specifically, unet_learner builds a U-Net-like architecture around the given encoder and loads ImageNet-pretrained weights for the encoder part only. If you are unsure about how U-Nets work, check out this paper. Note that we are passing dice as a metric, which will give us an idea of how our model might perform on the test set.

learn.lr_find()
learn.recorder.plot()

The plotted graph gives us an idea of what a good learning rate might be. According to FastAI, a good choice lies on the steepest downward slope of the curve, where the loss falls fastest toward its minimum. In this case, it can be anywhere around 1e-5.
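
The "steepest downward slope" rule can also be applied numerically. The sketch below is not FastAI's own implementation. It takes recorded (learning rate, loss) pairs and returns the rate where the loss drops fastest with respect to log10 of the learning rate. The curve here is synthetic and built to have its steepest drop near 1e-5. Real curves are noisy and usually need smoothing first.

```python
import numpy as np

def steepest_lr(lrs, losses):
    """Return the learning rate where the loss falls fastest w.r.t. log10(lr)."""
    lrs = np.asarray(lrs, dtype=float)
    losses = np.asarray(losses, dtype=float)
    slopes = np.gradient(losses, np.log10(lrs))
    return lrs[np.argmin(slopes)]

# Synthetic curve: flat at first, a drop centred at 1e-5, then divergence at high rates.
lrs = np.logspace(-7, -1, 61)
log_lr = np.log10(lrs)
losses = 1.0 - 0.5 / (1 + np.exp(-4 * (log_lr + 5))) + np.exp(3 * (log_lr + 2))

print(steepest_lr(lrs, losses))
```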

# Fit the model
learn.fit_one_cycle(10, 1e-5)

Now, we run 10 epochs with a maximum learning rate of 1e-5. FastAI uses the one-cycle policy for learning rate scheduling, which was introduced in this paper. This initial training updates only the decoder's parameters and keeps the pre-trained encoder's weights frozen. Once our model is trained well enough, we can unfreeze the encoder as well and train some more.
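
To get a feel for the shape of a one-cycle schedule, here is a standalone sketch. The learning rate warms up to the maximum and then anneals down to a much smaller value, using cosine interpolation in both phases. The defaults (30% warm-up, start at lr_max/25, end at lr_max/250000) are what I understand FastAI v1 to use, so treat them as assumptions. The momentum schedule that accompanies it is left out.

```python
import math

def annealing_cos(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.3, div_factor=25.0, final_div=25.0e4):
    """Learning rate at a given step of a one-cycle schedule (assumed defaults)."""
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        # Phase 1: rise from lr_max / div_factor up to lr_max.
        return annealing_cos(lr_max / div_factor, lr_max, step / warmup_steps)
    # Phase 2: fall from lr_max down to lr_max / final_div.
    pct = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return annealing_cos(lr_max, lr_max / final_div, pct)

schedule = [one_cycle_lr(s, 100, 1e-5) for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```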

# Unfreeze and train some more
learn.unfreeze()
learn.fit_one_cycle(10, slice(1e-6, 1e-5))

Now, we train for some more epochs with discriminative learning rates where the earlier layers are trained with a lower maximum learning rate, and the learning rates are increased for the subsequent layer groups. Now that our model is trained, we will visually inspect if the model works fine.
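
As a rough illustration of what slice(1e-6, 1e-5) means: as far as I understand FastAI v1, the two ends are spread geometrically across the model's layer groups, so the earliest group gets the smallest rate and the last group the largest. The helper below mimics that spacing in NumPy. The number of layer groups (3 here) is an assumption and depends on the architecture.

```python
import numpy as np

def even_mults(start, stop, n):
    """Return n learning rates spaced geometrically from start to stop."""
    if n == 1:
        return np.array([stop])
    ratio = (stop / start) ** (1 / (n - 1))
    return start * ratio ** np.arange(n)

# One maximum learning rate per layer group, earliest layers first.
group_lrs = even_mults(1e-6, 1e-5, 3)  # 3 layer groups is an assumption
print(group_lrs)
```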

learn.show_results(rows=3, figsize=(10,10))

Now that everything is set, we can run inference on the test set and make a submission!

from glob import glob

# Sort the test images by the number in their file names
lst = sorted(glob('./data/test_images/*'), key=lambda x: int(x.split('_')[-1].split('.')[0]))
main_array = []
for i in lst:
    # Open image
    img = open_image(i)
    mask = learn.predict(img)[0]
    # Convert torch tensor to numpy array
    mask = mask.data.numpy()
    # Flatten the array
    mask = mask.flatten()
    main_array.append(mask)

main_array = np.asarray(main_array)
main_array_flat = np.reshape(main_array, (-1)).astype(np.uint8)
with open('submission.npy', 'wb') as f:
    np.save(f, main_array_flat)

The above code runs inference on the test set and creates a submission file that can be submitted to the AICrowd website.
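
The custom sort key matters because a plain alphabetical sort would put image_10 before image_2, and the flattened predictions would then be in the wrong order. Here is a small sketch of that key and of the final flattening, with all-zero masks standing in for model predictions. The file names are hypothetical.

```python
import numpy as np

def numeric_key(path):
    """Extract the trailing integer from names like 'image_12.jpg'."""
    return int(path.split('_')[-1].split('.')[0])

# Hypothetical file names, for illustration only.
files = [
    'data/test_images/image_10.jpg',
    'data/test_images/image_2.jpg',
    'data/test_images/image_1.jpg',
]
ordered = sorted(files, key=numeric_key)

# Stand-in predictions: one 400x400 mask of 0/1 labels per file.
masks = [np.zeros((400, 400), dtype=np.int64) for _ in ordered]
flat = np.concatenate([m.ravel() for m in masks]).astype(np.uint8)

print(ordered)
print(flat.shape, flat.dtype)
```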

Conclusion

This is not a complete solution that will get you the best result on the leaderboard, but it is a great starting point from which to develop your own. Possible improvements include cross-validation, ensembling, test-time augmentation, and trying loss functions other than cross-entropy. I'd like to conclude by congratulating AICrowd's team for creating this wonderful platform and running this short blitz competition, which encourages beginners to step into the world of machine learning.
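
As one example of these improvements, test-time augmentation for segmentation can be as simple as predicting on the image and on its mirror image, flipping the second prediction back, and averaging the two. The sketch below shows only the mechanics. The toy_predict_proba function is a hypothetical stand-in for a trained model that returns a per-pixel water probability.

```python
import numpy as np

def predict_with_flip_tta(predict_proba, img):
    """Average water probabilities over the image and its horizontal flip."""
    p_plain = predict_proba(img)
    # Predict on the mirrored image, then mirror the prediction back.
    p_flip = predict_proba(img[:, ::-1])[:, ::-1]
    return (p_plain + p_flip) / 2

def toy_predict_proba(img):
    """Hypothetical stand-in model: the 'water probability' is just the blue channel."""
    return img[..., 2]

rng = np.random.default_rng(0)
img = rng.random((400, 400, 3))
proba = predict_with_flip_tta(toy_predict_proba, img)
mask = (proba > 0.5).astype(np.uint8)
print(mask.shape)
```

With a real model the two predictions generally differ, and averaging them tends to smooth out errors that depend on orientation.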

Translated from: https://medium.com/@ashwinr64/approaching-aicrowds-lndst-problem-in-under-50-lines-of-code-b8b5fb536f2b
