Note:
PIL reads the three channels in RGB order, and the width and height it reports match the original image. Converting between PIL and `numpy` has one subtlety: a PIL image reports its size as (width, height), while the array returned by `numpy.array()` has shape (height, width, channels), so the two dimensions appear swapped; `Image.fromarray()` restores the PIL convention. To read the pixel at a given position in PIL, call `img_pil.getpixel((x, y))`. A `numpy` array is a matrix indexed by row and then column, so the same pixel is `img_np[y, x]`.

Note:
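cv2 reads the three channels in BGR order, and it returns the image as a `numpy` array of shape (height, width, channels), so width and height appear in the opposite order from PIL's size. Elements are accessed the same way as in `numpy`, by direct indexing. cv2 can display a `numpy` array (`uint8`) directly; to keep the window from closing immediately, `cv2.waitKey()` is usually added so that the program waits for a key press before exiting.

Note: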
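`skimage` is quite similar to cv2: the printed results are essentially the same (apart from the channel order, which is RGB in `skimage`), and conversion to and from `numpy` is straightforward. `skimage` cannot display an image on its own and relies on `matplotlib.pyplot`, so it is usually combined with `pyplot` to visualize intermediate steps, which makes drawing figures and tables convenient.

In summary, PIL preserves as much of the original input information as possible and is quick and easy to use; it is also often combined with imageio for image preprocessing. cv2 turns the image into an array, which is convenient for further processing. `skimage` combined with `matplotlib` makes side-by-side image comparison easier.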
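The axis and channel conventions compared in the notes above can be checked on a small synthetic image. The following is a minimal sketch that assumes only `numpy` and Pillow are installed; the image contents and variable names are made up for illustration, and the BGR view is produced by reversing the channel axis in `numpy` rather than by calling cv2.

```python
import numpy as np
from PIL import Image

# Synthetic RGB image, 4 pixels wide and 2 pixels high.
# PIL sizes are always given as (width, height).
img_pil = Image.new("RGB", (4, 2), color=(0, 0, 0))
img_pil.putpixel((3, 1), (255, 128, 0))  # x=3 (column), y=1 (row)

# Converting to numpy yields shape (height, width, channels).
img_np = np.array(img_pil)
print(img_pil.size)   # (width, height)
print(img_np.shape)   # (height, width, channels)

# The same pixel: PIL takes (x, y), numpy takes [row, column] = [y, x].
pixel_pil = img_pil.getpixel((3, 1))
pixel_np = tuple(int(v) for v in img_np[1, 3])

# Converting back restores the PIL (width, height) convention.
img_back = Image.fromarray(img_np)

# Reversing the last axis turns RGB into the BGR order that cv2 uses.
img_bgr = img_np[:, :, ::-1]
pixel_bgr = tuple(int(v) for v in img_bgr[1, 3])
```

Reversing the last axis is also a common way to go from a cv2 image to RGB before handing it to PIL or `matplotlib`; `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` is the equivalent call when cv2 is available.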
PyTorch's `DataLoader` needs a `dataset` to load from. Here the `dataset` is a custom class that returns each image together with its label and applies data augmentation to the image; at that stage the image is a PIL object. The example uses the Stanford Cars dataset (code source: sourcecode):
    import os

    import imageio
    import numpy as np
    from PIL import Image
    from torchvision import transforms


    class STANFORD_CAR():
        def __init__(self, input_size, root, is_train=True, data_len=None):
            self.input_size = input_size
            self.root = root
            self.is_train = is_train
            train_img_path = os.path.join(self.root, 'cars_train')
            test_img_path = os.path.join(self.root, 'cars_test')
            train_img_label = []
            test_img_label = []
            # Each line holds a file name and a label; labels are shifted to start at 0.
            with open(os.path.join(self.root, 'train.txt')) as train_label_file:
                for line in train_label_file:
                    fields = line.strip().split(' ')
                    train_img_label.append([os.path.join(train_img_path, fields[0]), int(fields[1]) - 1])
            with open(os.path.join(self.root, 'test.txt')) as test_label_file:
                for line in test_label_file:
                    fields = line.strip().split(' ')
                    test_img_label.append([os.path.join(test_img_path, fields[0]), int(fields[1]) - 1])
            # data_len=None keeps every sample.
            self.train_img_label = train_img_label[:data_len]
            self.test_img_label = test_img_label[:data_len]

        def __getitem__(self, index):
            if self.is_train:
                img, target = imageio.imread(self.train_img_label[index][0]), self.train_img_label[index][1]
                # Grayscale images are 2-D; repeat the plane to get three channels.
                if len(img.shape) == 2:
                    img = np.stack([img] * 3, 2)
                img = Image.fromarray(img, mode='RGB')
                img = transforms.Resize((self.input_size, self.input_size), Image.BILINEAR)(img)
                # img = transforms.RandomResizedCrop(size=self.input_size,
                #                                    scale=(0.4, 0.75), ratio=(0.5, 1.5))(img)
                # img = transforms.RandomCrop(self.input_size)(img)
                img = transforms.RandomHorizontalFlip()(img)
                img = transforms.ColorJitter(brightness=0.2, contrast=0.2)(img)
                img = transforms.ToTensor()(img)
                img = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])(img)
            else:
                img, target = imageio.imread(self.test_img_label[index][0]), self.test_img_label[index][1]
                if len(img.shape) == 2:
                    img = np.stack([img] * 3, 2)
                img = Image.fromarray(img, mode='RGB')
                img = transforms.Resize((self.input_size, self.input_size), Image.BILINEAR)(img)
                # img = transforms.CenterCrop(self.input_size)(img)
                img = transforms.ToTensor()(img)
                img = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])(img)
            return img, target

        def __len__(self):
            if self.is_train:
                return len(self.train_img_label)
            else:
                return len(self.test_img_label)

This code uses PIL, `numpy`, and imageio together to load the images and apply the augmentation.
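To make the evaluation branch of `__getitem__` easier to follow, the sketch below reproduces its steps with `numpy` and Pillow only. It is a stand-in, not the torchvision code itself: the `preprocess` helper and the constant names are hypothetical, and the last three lines approximate `ToTensor` (scale to [0, 1], channels first) and `Normalize` (per-channel mean and standard deviation) by hand.

```python
import numpy as np
from PIL import Image

# Per-channel statistics taken from the Normalize call in the listing above.
CHANNEL_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
CHANNEL_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(img, input_size):
    """Hypothetical numpy/PIL stand-in for the test-time pipeline.

    `img` is a uint8 array of shape (H, W) or (H, W, 3).
    """
    # Grayscale input is 2-D; repeat the plane along a new last axis.
    if img.ndim == 2:
        img = np.stack([img] * 3, axis=2)
    # PIL resize takes (width, height); both are input_size here.
    pil_img = Image.fromarray(img).resize((input_size, input_size), Image.BILINEAR)
    # Like ToTensor: scale uint8 values to floats in [0, 1].
    arr = np.asarray(pil_img, dtype=np.float32) / 255.0
    # Like Normalize: subtract the mean and divide by the std per channel.
    arr = (arr - CHANNEL_MEAN) / CHANNEL_STD
    # Like ToTensor: move channels first, (H, W, C) -> (C, H, W).
    return arr.transpose(2, 0, 1)


# A constant gray image, 80 pixels wide and 60 pixels high.
gray = np.full((60, 80), 128, dtype=np.uint8)
out = preprocess(gray, 224)
print(out.shape, out.dtype)
```

The training branch differs only in the random flip and color jitter applied between the resize and the tensor conversion.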