Semantic Segmentation in Practice with PyTorch (Part 1: Dataset Processing), Explained in Detail!!

Technology, 2024-10-23

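I carefully studied the semantic segmentation project by the blogger ZJE_ANDY (referred to below as Z; no offense intended). Many thanks!!

Z's post: "Semantic segmentation of a handbag dataset with FCN in PyTorch (training code + code for predicting a single input image)"

Here are my notes, in full detail!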

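First, let's look at the dataset-handling code.

The dataset Z put together consists of the original training images (in the last folder) and the training label images (in the last_msk folder). The dataset preparation code lives in a file named BagData.py. To reuse it later, you only need to change the directory paths, which is very convenient.

The code is attached here, with some comments added.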

''' BagData.py '''
import os

import cv2
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset, random_split
from torchvision import transforms

# transforms preprocess and augment images; Compose chains several steps together.
# ToTensor: turns an (H, W, C) uint8 image with values 0-255 into a (C, H, W) float tensor scaled to 0-1
# Normalize: standardizes each channel with the given mean and standard deviation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])


def onehot(data, n):
    # data: integer class map of shape (H, W); returns an (H, W, n) one-hot array
    buf = np.zeros(data.shape + (n, ))
    # in the flattened buffer, the entry for (pixel k, class data[k]) sits at k*n + data[k]
    nmsk = np.arange(data.size)*n + data.ravel()
    buf.ravel()[nmsk] = 1
    return buf


class BagDataset(Dataset):

    def __init__(self, transform=None):
        self.transform = transform

    def __len__(self):
        return len(os.listdir('./bags/last'))

    def __getitem__(self, idx):
        # read the original image
        img_name = os.listdir('./bags/last')[idx]
        imgA = cv2.imread('./bags/last/'+img_name)
        imgA = cv2.resize(imgA, (160, 160))
        # read the label image (a binary mask with the same file name) as grayscale
        imgB = cv2.imread('./bags/last_msk/'+img_name, 0)
        imgB = cv2.resize(imgB, (160, 160))
        imgB = imgB/255
        imgB = imgB.astype('uint8')
        imgB = onehot(imgB, 2)  # two classes (handbag and background), so n is 2
        imgB = imgB.transpose(2,0,1)  # imgB does not go through transform, so convert (H,W,C) to (C,H,W) by hand
        imgB = torch.FloatTensor(imgB)
        if self.transform:
            imgA = self.transform(imgA)  # once converted to a tensor, imgA is laid out as (C,H,W)
        return imgA, imgB


bag = BagDataset(transform)

train_size = int(0.9 * len(bag))  # 90% of the whole dataset is used for training
test_size = len(bag) - train_size
train_dataset, test_dataset = random_split(bag, [train_size, test_size])  # split into training and test sets at 9:1

train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=4)
test_dataloader = DataLoader(test_dataset, batch_size=4, shuffle=True, num_workers=4)

if __name__ == '__main__':
    for train_batch in train_dataloader:
        print(train_batch)
    for test_batch in test_dataloader:
        print(test_batch)
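To make the 9:1 split concrete, here is a small plain-Python sketch of the arithmetic and of what a random split does to the sample indices. It is only an illustration of the idea, not how random_split is implemented; the dataset size of 25 and the fixed seed are assumptions made up for the example.

```python
import random

n_samples = 25                        # hypothetical dataset size
train_size = int(0.9 * n_samples)     # int() truncates 22.5 down to 22
test_size = n_samples - train_size    # the remainder goes to the test set

# Shuffle the indices once, then cut the list at train_size
indices = list(range(n_samples))
random.Random(0).shuffle(indices)     # fixed seed only to make the sketch repeatable
train_idx = indices[:train_size]
test_idx = indices[train_size:]

print(train_size, test_size)                  # 22 3
print(set(train_idx).isdisjoint(test_idx))    # True: no sample lands in both sets
```

Computing test_size as the remainder, as BagData.py does, guarantees that the two sizes always add up to the dataset length even when 0.9 * len is not a whole number.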

Below, following the order of the code, I explain what certain statements mean and do. If anything is wrong, please point it out.

①transform

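transforms in torchvision handles image preprocessing, data augmentation, and so on.
Compose: chains several processing steps into one.
ToTensor: converts pixel values from the original 0-255 range to 0-1, and changes the layout from (H, W, C) to (C, H, W).
Normalize: standardizes the pixel values of each channel with the given mean and standard deviation, i.e. (x - mean) / std.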
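The arithmetic behind these two steps can be reproduced with NumPy alone. The sketch below is not torchvision code; it only mimics what ToTensor and Normalize compute for a uint8 image, using a made-up 2x2 image as input.

```python
import numpy as np

# Hypothetical 2x2 three-channel image with uint8 pixel values (illustrative data)
img = np.array([[[0, 128, 255], [255, 255, 255]],
                [[64, 64, 64], [0, 0, 0]]], dtype=np.uint8)

# Step 1, the ToTensor arithmetic: (H, W, C) -> (C, H, W), then scale 0-255 to 0-1
chw = img.transpose(2, 0, 1).astype(np.float32) / 255.0

# Step 2, the Normalize arithmetic: per-channel (x - mean) / std
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)
normalized = (chw - mean) / std

print(chw.shape)             # (3, 2, 2)
print(normalized[:, 1, 1])   # the black pixel ends up at -mean/std in each channel
```

After normalization the values are no longer confined to 0-1; a black pixel, for instance, becomes negative in every channel.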

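②One-Hot encoding, also called one-bit-effective encoding

It uses an N-bit register to encode N states: each state has its own register bit, and exactly one bit is active at any time. The encoding represents a categorical variable as a binary vector. The categorical values are first mapped to integers, and each integer is then written as a binary vector that is all zeros except for a 1 at the position given by that integer.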
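As a minimal sketch of this idea applied to a segmentation mask, the snippet below one-hot encodes a tiny 2x2 label map by writing a 1 at flat index pixel * n + class. The function name onehot_sketch and the sample mask are made up for the illustration; the cross-check against np.eye is just one convenient way to confirm the result.

```python
import numpy as np

def onehot_sketch(data, n):
    """One-hot encode an integer label map of shape (H, W) into (H, W, n)."""
    buf = np.zeros(data.shape + (n,))
    # In the flattened buffer, pixel k owns positions k*n ... k*n + n - 1,
    # so class c of pixel k lives at k*n + c.
    idx = np.arange(data.size) * n + data.ravel()
    buf.ravel()[idx] = 1          # ravel() of a fresh contiguous array is a view
    return buf

mask = np.array([[0, 1],
                 [1, 0]], dtype=np.uint8)   # 0 = background, 1 = bag
encoded = onehot_sketch(mask, 2)

print(encoded.shape)    # (2, 2, 2)
print(encoded[0, 0])    # [1. 0.]  -> class 0
print(encoded[0, 1])    # [0. 1.]  -> class 1
print(np.array_equal(encoded, np.eye(2)[mask]))   # True
```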

③np.arange()

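This function takes one, two, or three arguments. With one argument, that argument is the end point; the start defaults to 0 and the step to 1. With two arguments, the first is the start and the second is the end; the step defaults to 1. With three arguments, they are the start, the end, and the step, and the step may be fractional. In every case the end point itself is excluded.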
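A quick sketch of the three call forms, followed by the way onehot uses np.arange: multiplying the pixel indices by the number of classes gives the offset at which each pixel starts in the flattened one-hot buffer. The variable names are only illustrative.

```python
import numpy as np

one_arg = np.arange(5)               # end only
two_args = np.arange(2, 6)           # start, end
three_args = np.arange(0, 1, 0.25)   # start, end, fractional step

print(one_arg.tolist())      # [0, 1, 2, 3, 4]
print(two_args.tolist())     # [2, 3, 4, 5]
print(three_args.tolist())   # [0.0, 0.25, 0.5, 0.75]

# As in onehot: the starting offset of each of 4 pixels when there are 2 classes
n_classes = 2
offsets = np.arange(4) * n_classes
print(offsets.tolist())      # [0, 2, 4, 6]
```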

④np.ravel()

Flattens a multi-dimensional array into one dimension. flatten and ravel do almost the same thing, so let's look at how they differ.

import numpy as np

a = np.array([[1,2],[3,4]])
b = a.flatten()
print('b:', b)
b[0] = 5
print('a:', a)

Output:

b: [1 2 3 4]
a: [[1 2]
 [3 4]]

import numpy as np

a1 = np.array([[1,2],[3,4]])
b1 = a1.ravel()
print('b1:', b1)
b1[0] = 5
print('a1:', a1)

Output:

b1: [1 2 3 4]
a1: [[5 2]
 [3 4]]

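As you can see, modifying the result of flatten leaves the original array untouched, because flatten always returns a copy. ravel returns a view of the original array whenever it can, so writing to the result changes the original as well.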
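One way to check the copy-versus-view behaviour directly is np.shares_memory. The sketch below also shows a case where ravel cannot return a view: flattening a transposed array in row-major order requires rearranging the elements, so NumPy has to copy.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

flat_copy = a.flatten()      # always a copy
flat_view = a.ravel()        # a view, because a is contiguous
flat_of_t = a.T.ravel()      # transposed data is not contiguous in row-major order

print(np.shares_memory(a, flat_copy))   # False
print(np.shares_memory(a, flat_view))   # True
print(np.shares_memory(a, flat_of_t))   # False
```

This matters for onehot in BagData.py: buf.ravel()[...] = 1 only modifies buf because buf is a freshly created contiguous array, so ravel hands back a view of it.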

⑤Class

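Class: describes a collection of objects that share the same attributes and methods. It defines the attributes and methods common to every object in the collection. An object is an instance of a class.

Class variable: a variable shared by all instances of the class. It is defined inside the class but outside any method body. Class variables are usually not used as instance variables.

Local variable: a variable defined inside a method. It exists only within that method.

Data member: a class variable or an instance variable, holding the data associated with a class and its instances.

Method overriding: if a method inherited from the parent class does not meet the needs of the subclass, the subclass can redefine it. This is called overriding the method.

Instance variable: in a class, attributes are represented by variables, and the ones that belong to an individual instance are called instance variables. In Python they are created by assigning to self inside a method, usually __init__, so they are set when the instance is created.

Inheritance: a derived class inherits the fields and methods of its base class. Inheritance also allows an object of a derived class to be treated as an object of the base class.

Method: a function defined in a class.

Object: an instance of the data structure defined by a class. An object comprises two kinds of data members (class variables and instance variables) and methods.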

class Employee():
    empCount = 0  # empCount is a class variable
    def __init__(self):
        print('name,salary')

e = Employee()
print(e.empCount)
'''
Output:
name,salary
0
'''
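To tie the three kinds of variable together, here is a small sketch with a made-up class (Worker and its attribute names are purely illustrative): the class variable is shared by every instance, each instance variable belongs to one object, and the local variable disappears when the method returns.

```python
class Worker:
    worker_count = 0                  # class variable: one value shared by all instances

    def __init__(self, name):
        self.name = name              # instance variable: one value per object
        Worker.worker_count += 1      # update the shared counter through the class

    def describe(self):
        label = 'worker'              # local variable: exists only inside this call
        return f'{label}: {self.name}'

a = Worker('Ann')
b = Worker('Bob')

print(Worker.worker_count)                 # 2
print(a.worker_count, b.worker_count)      # 2 2  (both instances see the same value)
print(a.describe())                        # worker: Ann
print(b.describe())                        # worker: Bob
```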

''' Create a 'fish' class '''
class fish():
    def weight(self, w):
        print('Weight of the fish:', w)

cat = fish()
cat.weight(100)
''' Output: Weight of the fish: 100 '''

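About self: 1. self refers to the instance of the class; 2. the name self is only a convention and can be replaced by another word; 3. self must be declared as the first parameter, but you never pass a value for it by hand.

The fish class defines a function weight() with two parameters, self and w. self only has to appear in the definition; when calling the method you pass w and Python supplies self automatically.

When you want to give the fish class some attributes and initialize each new instance, define an __init__ function. It runs automatically whenever an object is created.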

class fish():
    def __init__(self):
        print('Fish cannot live without water.')
    def weight(self, w):
        print('Weight of the fish:', w, 'g')

cat = fish()
cat.weight(100)
'''
Output:
Fish cannot live without water.
Weight of the fish: 100 g
'''

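There are many kinds of fish, so let's subdivide further.

We create a subclass on top of the parent class fish. The subclass inherits some properties of the parent, such as 'fish cannot live without water', but also has properties of its own. Below, class CaoYu(fish) defines the subclass CaoYu under the parent class fish.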

'''
class x(y): the general way to create a subclass; it creates class x as a subclass of class y.
'''
class fish():
    def __init__(self):
        print('fish can not live without water.')
    def weight(self, w):
        print('the weight of fish is:', w)

class CaoYu(fish):
    def outlook(self):
        print('this fish is beautiful!')

cat = CaoYu()
cat.weight(100)
cat.outlook()
'''
Output:
fish can not live without water.
the weight of fish is: 100
this fish is beautiful!
'''

'''
super().__init__(): this belongs to model.py, but it is covered here for convenience.
Usage: call it when the subclass wants to keep what the parent's constructor does and add to it.
'''
# Do not give a method the same name as an instance attribute (hence name1 rather than name);
# otherwise the attribute assigned in __init__ shadows the method and calling it raises an error
class person():
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def name1(self, name):
        print("this person's function name is", name)
        print("this person's class name is", self.name)
    def age1(self, age):
        print("this person's function age is", age)
        print("this person's class age is", self.age)

class new_person(person):
    def __init__(self, new_name, new_age, sex):
        super().__init__(new_name, new_age)  # the parent sets self.name and self.age
        self.sex = sex                       # the subclass adds its own attribute
    def diaoyong(self, name, age):
        self.name1(name)
        self.age1(age)
    def sex1(self, sex):
        print("this new person's function sex is", sex)
        # self.sex exists only because new_person.__init__ sets it; the parent class person has no sex
        print("this new person's class sex is", self.sex)

new_p = new_person('Tom', '20', 'female')
new_p.diaoyong('Michael', '13')
new_p.sex1('female')
'''
Out:
this person's function name is Michael
this person's class name is Tom
this person's function age is 13
this person's class age is 20
this new person's function sex is female
this new person's class sex is female
'''

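The subclass new_person needs the parent's name and age and also a new attribute, sex. This is where super().__init__() comes in: it runs the parent's constructor so that name and age are set the same way as in person. Since the parent's __init__() takes only two arguments besides self, super().__init__() is passed just two.

If super().__init__() has stored a value passed in through the subclass, but the subclass later assigns the same attribute again, the later assignment wins, because both writes go to the same attribute on the same instance. In that case the call to super().__init__() no longer has any visible effect, as shown below.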

class person():
    def __init__(self, name):
        self.name = name
    def name1(self, name):
        print("this person's function name is", name)
        print("this person's class name is", self.name)

class new_person(person):
    def __init__(self, new_name):
        super().__init__(new_name)   # sets self.name = 'Tom'
    def diaoyong(self, name):
        self.name = 'Amber'          # overwrites the value set by the parent's __init__
        self.name1(name)

new_p = new_person('Tom')
new_p.diaoyong('Michael')
'''
Out:
this person's function name is Michael
this person's class name is Amber
'''

⑥os.listdir()

This function returns, as a list, the names of the files and folders inside the given directory. It works on both Windows and Unix. The order of the list is arbitrary.

import os

path = 'c:/users/w1998/desktop/jupyter-Code'
dirs = os.listdir(path)
dir0 = os.listdir(path)[2]
print(dirs)
print('\n', dir0)
'''
Out:
['.ipynb_checkpoints', 'Bags-Notes.ipynb', 'Bags.ipynb', 'cifar.h5', 'Cifar10.ipynb', 'Cluster.ipynb', 'Flower.ipynb', 'LSTM.ipynb', 'Net.ipynb', 'practice.ipynb', 'Song.ipynb', 'VectorMachine.ipynb']

 Bags.ipynb
'''
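BagData.py picks an image by position in os.listdir and then looks up the mask with the same file name. Because os.listdir makes no promise about ordering, one hedged option is to sort the names once so that index idx always means the same file. The sketch below builds throwaway folders in a temporary directory; the folder names follow the post, while the file names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    img_dir = os.path.join(root, 'last')
    msk_dir = os.path.join(root, 'last_msk')
    os.makedirs(img_dir)
    os.makedirs(msk_dir)

    # Create empty placeholder files, deliberately out of order
    for name in ['2.jpg', '0.jpg', '1.jpg']:
        for folder in (img_dir, msk_dir):
            open(os.path.join(folder, name), 'w').close()

    names = sorted(os.listdir(img_dir))     # fixed, reproducible order
    pairs = [(os.path.join(img_dir, n), os.path.join(msk_dir, n)) for n in names]
    all_masks_exist = all(os.path.exists(mask_path) for _, mask_path in pairs)

print(names)             # ['0.jpg', '1.jpg', '2.jpg']
print(all_masks_exist)   # True
```

Building the sorted list once, for example in __init__, would also avoid calling os.listdir again on every __getitem__.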

⑦OpenCV

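OpenCV is convenient and powerful for image processing. I will skip the theory here; if you are interested, consult the official documentation. Only the functions used in Z's post are covered.

import cv2

img = cv2.imread('C:/Users/W1998/Desktop/a.jpg')      # color image; OpenCV stores the channels in BGR order
img0 = cv2.imread('C:/Users/W1998/Desktop/a.jpg', 0)  # the flag 0 reads the picture as grayscale
cv2.imshow('Original', img)
cv2.imshow('Grey', img0)
cv2.waitKey(0)           # wait for any key press
cv2.destroyAllWindows()  # close all image windows

img1 = cv2.resize(img, (128,128))  # resize the picture to 128*128
cv2.imshow('resize', img1)
cv2.waitKey(0)
cv2.destroyAllWindows()

'''
uint8: unsigned 8-bit integer. In image processing, RGB values range from 0 to 255.
To make images easier to work with, pixel values are usually normalized to the range 0-1.
To do that, cast the pixel values to integers and then divide by 255: img.astype('uint8')/255.
This converts the pixel values from an integer type to a float type.
'''
img2 = img.astype('uint8')
cv2.imshow('int', img2)
img3 = img.astype('uint8')/255
cv2.imshow('float', img3)
cv2.waitKey(0)
cv2.destroyAllWindows()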
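The order of the two operations matters. BagData.py divides the mask by 255 first and casts to uint8 afterwards, which truncates every value below 255 to 0. The NumPy sketch below uses a made-up row of mask pixels to show the effect; resizing a binary mask can produce such in-between grays when the interpolation blends neighbouring pixels. The threshold at 127 is just one alternative, not something taken from Z's code.

```python
import numpy as np

# Hypothetical mask pixels after resizing (0 = background, 255 = bag, rest = blended edges)
mask = np.array([0, 100, 200, 254, 255], dtype=np.uint8)

truncated = (mask / 255).astype('uint8')      # divide, then cast: only exact 255 survives as 1
thresholded = (mask > 127).astype('uint8')    # alternative: anything brighter than mid-gray is 1

print(truncated.tolist())     # [0, 0, 0, 0, 1]
print(thresholded.tolist())   # [0, 0, 1, 1, 1]
```

Either way the result contains only 0 and 1, which is what onehot(imgB, 2) expects; the two approaches differ only in how edge pixels are labeled.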

⑧FloatTensor

torch.Tensor is an alias for the default tensor type (torch.FloatTensor). The values it holds are floats.

import torch

a = torch.FloatTensor([[1,2,3],[4,5,6]])
print(a)
'''
Out:
tensor([[1., 2., 3.],
        [4., 5., 6.]])
'''

So what does the result look like when FloatTensor is applied to an image? Try it with a picture you like and look at the output.

import matplotlib.pyplot as plt
import matplotlib.image as image
import numpy as np
import torch

# matplotlib reads a PNG as floats in 0-1; here the last dimension has 4 values (RGBA, alpha = 1.0)
img = image.imread('c:/users/w1998/desktop/bag2.png')
img1 = torch.FloatTensor(img)
plt.imshow(img)
print(img1)
'''
Out:
tensor([[[0.5412, 0.5451, 0.5216, 1.0000],
         [0.5412, 0.5451, 0.5216, 1.0000],
         [0.5451, 0.5490, 0.5255, 1.0000],
         ...,
         [0.4588, 0.4431, 0.4078, 1.0000],
         [0.4588, 0.4471, 0.4118, 1.0000],
         [0.4588, 0.4471, 0.4118, 1.0000]],

        [[0.5333, 0.5373, 0.5137, 1.0000],
         [0.5333, 0.5451, 0.5176, 1.0000],
         [0.5373, 0.5490, 0.5216, 1.0000],
         ...,
         [0.4627, 0.4471, 0.4118, 1.0000],
         [0.4588, 0.4471, 0.4118, 1.0000],
         [0.4627, 0.4510, 0.4157, 1.0000]],

        [[0.5294, 0.5333, 0.5137, 1.0000],
         [0.5333, 0.5294, 0.5137, 1.0000],
         [0.5373, 0.5333, 0.5176, 1.0000],
         ...,
         [0.4627, 0.4431, 0.4196, 1.0000],
         [0.4667, 0.4510, 0.4157, 1.0000],
         [0.4667, 0.4510, 0.4078, 1.0000]],

        ...,

        [[0.8980, 0.9020, 0.8824, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         ...,
         [0.9059, 0.9098, 0.8902, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.8941, 0.8980, 0.8784, 1.0000]],

        [[0.8980, 0.9020, 0.8824, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         ...,
         [0.9098, 0.9137, 0.8941, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.8941, 0.8980, 0.8784, 1.0000]],

        [[0.8941, 0.8980, 0.8784, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         ...,
         [0.9020, 0.9059, 0.8863, 1.0000],
         [0.8980, 0.9020, 0.8824, 1.0000],
         [0.9020, 0.9059, 0.8863, 1.0000]]])
'''

⑨DataLoader

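DataLoader is responsible for splitting the data into batches. Here is what its parameters mean.

dataset: the dataset to be batched.

batch_size (int, optional): the number of samples in each batch. The default is 1.

shuffle (bool, optional): whether to reshuffle the data at the start of every epoch. The default is False.

sampler (Sampler, optional): a custom strategy for drawing samples from the dataset. If this is set, shuffle must not be enabled.

batch_sampler (Sampler, optional): like sampler, but it returns the indices of one whole batch at a time. If this is set, batch_size, shuffle, sampler, and drop_last can no longer be set.

num_workers (int, optional): how many subprocesses handle data loading. The default is 0, in which case all data is loaded in the main process.

collate_fn (callable, optional): the function that merges a list of samples into a mini-batch.

pin_memory (bool, optional): when True, the DataLoader copies the tensors into CUDA pinned memory before returning them.

drop_last (bool, optional): when True, the last incomplete batch is dropped. When False, the last batch is simply smaller. The default is False.

timeout (numeric, optional): must be non-negative. If positive, it is the maximum time to wait for a batch to be collected from the worker processes; if no batch arrives in time, an error is raised. The default is 0.

worker_init_fn (callable, optional): an initialization function for each worker. If it is not None, it is called in every worker subprocess with the worker id (an integer in [0, num_workers-1]) as input. The default is None.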
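To see how batch_size, shuffle, and drop_last interact, here is a plain-Python sketch of the batching logic. It is an illustration only and not PyTorch's implementation: iter_batches and its seed parameter are hypothetical names, samples are returned as plain lists instead of being collated into tensors, and there are no worker processes.

```python
import random

def iter_batches(dataset, batch_size=1, shuffle=False, drop_last=False, seed=None):
    """Yield lists of samples, mimicking the roles of batch_size, shuffle and drop_last."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)      # new order of sample indices
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        if drop_last and len(batch_idx) < batch_size:
            break                                 # discard the incomplete final batch
        yield [dataset[i] for i in batch_idx]

data = list(range(10))    # a stand-in dataset of 10 samples

sizes_keep = [len(b) for b in iter_batches(data, batch_size=4)]
sizes_drop = [len(b) for b in iter_batches(data, batch_size=4, drop_last=True)]
shuffled = [x for b in iter_batches(data, batch_size=4, shuffle=True, seed=0) for x in b]

print(sizes_keep)                 # [4, 4, 2]
print(sizes_drop)                 # [4, 4]
print(sorted(shuffled) == data)   # True: shuffling reorders samples but loses none
```

With 10 samples and batch_size=4, the last batch holds only 2 samples unless drop_last is set, which matches the description of drop_last above.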

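⑩if __name__ == '__main__':

A Python file can be used in two ways: ① executed directly as a script; ② imported into another .py file. The code under this statement runs only in the first case and is skipped in the second.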

print('this is a term.')
if __name__ == '__main__':
    print('this is a function.')
'''
Out:
this is a term.
this is a function.
'''

'''
Put the code
if __name__ == '__main__':
    print('this is a function.')
into a.py, and place the two files (Bag-Notes.ipynb and a.py) in the same folder.
'''
import a
print('this is a term.')
'''
Out:
this is a term.
'''

The second snippet prints only 'this is a term.'. The guarded code in a.py is not executed.

How it works: every Python module has a built-in variable __name__. When a file is executed directly, Python sets its __name__ to the string '__main__'. When the same file is imported into another file, __name__ is the module name, that is, the file name without the .py suffix. So __name__ == '__main__' is true only when the module itself is being run. Here is an example.

print('Hi,I am the first.')
print(__name__)
if __name__ == '__main__':
    print('Hi, I am the second.')
'''
Out:
Hi,I am the first.
__main__
Hi, I am the second.
'''

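Put the code from the previous cell into a.py and import it from a Python prompt in the command window (import a). The result is as follows.

Hi,I am the first.
a

As you can see, the name printed this time is a.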
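Both cases can be reproduced side by side from a single script. The sketch below writes a throwaway a.py into a temporary folder, then starts two separate Python processes: one runs the file directly and one imports it. The file contents and messages are made up for the illustration.

```python
import os
import subprocess
import sys
import tempfile

source = (
    "print(__name__)\n"
    "if __name__ == '__main__':\n"
    "    print('run as a script')\n"
)

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, 'a.py'), 'w') as f:
        f.write(source)

    # Case 1: execute a.py directly
    as_script = subprocess.run([sys.executable, 'a.py'], cwd=folder,
                               capture_output=True, text=True).stdout.splitlines()
    # Case 2: import a from other code running in the same folder
    as_module = subprocess.run([sys.executable, '-c', 'import a'], cwd=folder,
                               capture_output=True, text=True).stdout.splitlines()

print(as_script)   # ['__main__', 'run as a script']
print(as_module)   # ['a']
```

This is also why BagData.py puts its batch-printing loops under the guard: importing the file from a training script gives access to the dataloaders without printing every batch.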

This post has gotten long and you are probably tired by now, so see you in the next one. ヾ(✿ﾟ▽ﾟ)ノ
