Image-reading libraries roundup: cv2, PIL, skimage with numpy, and PyTorch (ToPILImage)

    Technology, 2022-08-08

    1 Reading images and their attributes

    1.1 Converting between PIL and numpy
    import numpy as np
    from PIL import Image

    # read an image with 3 channels, 500x889 pixels
    img_pil = Image.open('./test.png')
    # show the image
    img_pil.show()
    # get image info
    print(img_pil)
    # get a pixel value in PIL format
    print(img_pil.getpixel((0, 0)))
    # convert PIL to numpy
    img_np = np.array(img_pil)
    print(img_np.shape)
    # get a pixel value in numpy format
    print(img_np[0, 0])
    # convert numpy to PIL
    img_pil = Image.fromarray(img_np)
    print(img_pil)

    """
    <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x889 at 0x193331AD240>
    (219, 210, 193)
    (889, 500, 3)
    [219 210 193]
    <PIL.Image.Image image mode=RGB size=500x889 at 0x1933330ADA0>
    """

    Note:

    PIL reads the three channels in RGB order, and the width and height it reports match the original image. Converting between PIL and numpy involves one subtle difference: PIL reports size as (width, height), while the array returned by numpy.array() has shape (height, width, channels), so the two values appear swapped; Image.fromarray() restores the original PIL layout. To read the pixel at a given position, PIL uses img_pil.getpixel((x, y)), whereas numpy is a matrix indexed directly by row and column, i.e. img_np[y, x].
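
    The size, shape and indexing conventions described in this note can be checked on a small synthetic image. The following is a minimal sketch that builds a 5x3 image in memory rather than reading './test.png'; the image size and the pixel position are made-up values chosen only for illustration.

```python
import numpy as np
from PIL import Image

# Build a tiny RGB image in memory: width=5, height=3 (hypothetical values).
img_pil = Image.new("RGB", (5, 3), color=(0, 0, 0))
img_pil.putpixel((4, 1), (219, 210, 193))  # x=4 (column), y=1 (row)

img_np = np.array(img_pil)

print(img_pil.size)   # PIL reports (width, height)
print(img_np.shape)   # numpy reports (height, width, channels)

# PIL indexes as (x, y); numpy indexes as [row, column], i.e. [y, x].
pil_pixel = img_pil.getpixel((4, 1))
np_pixel = tuple(int(v) for v in img_np[1, 4])
print(pil_pixel, np_pixel)

# The round trip back to PIL restores the original (width, height).
img_back = Image.fromarray(img_np)
print(img_back.size)
```

    Both lookups address the same pixel, which is why swapping x and y is the most common mistake when moving between the two libraries.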
    1.2 Converting between cv2 and numpy
    import numpy as np
    import cv2

    # read an image with 3 channels, 500x889 pixels
    img_cv = cv2.imread('./test.png')
    # show the image
    cv2.imshow('img', img_cv)
    # get image info
    print(img_cv.shape)
    # get a pixel value in cv2 format
    print(img_cv[0, 0])
    # convert cv2 to numpy
    img_np = np.array(img_cv)
    print(img_np.shape)
    # get a pixel value in numpy format
    print(img_np[0, 0])
    # convert numpy to cv2 (not necessary)
    cv2.imshow('img_np', img_np)
    cv2.waitKey(0)

    """
    (889, 500, 3)
    [193 210 219]
    (889, 500, 3)
    [193 210 219]
    """

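    Note:

    cv2 reads the three channels in BGR order, and the array shape is (height, width, channels), so width and height appear swapped relative to the size PIL reports. cv2.imread already returns a numpy array, so elements are accessed by index exactly as in numpy. cv2 can display a numpy array (uint8) directly. To keep the cv2 window from closing immediately, cv2.waitKey() is usually added so that the program waits for a key press before exiting.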
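
    Because cv2 stores channels as BGR while PIL and skimage use RGB, the channel axis usually has to be reversed before an image is passed from one library to another. The sketch below does this with plain numpy slicing on a stand-in array, so it runs without OpenCV installed; with OpenCV available, cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs the same conversion. The array size is a made-up value.

```python
import numpy as np

# Stand-in for the array returned by cv2.imread: shape (height, width, 3), BGR order.
img_bgr = np.zeros((3, 5, 3), dtype=np.uint8)
img_bgr[0, 0] = [193, 210, 219]  # B, G, R (the pixel values printed in the cv2 example)

# Reverse the last axis to go from BGR to RGB (the same slice converts back again).
img_rgb = img_bgr[:, :, ::-1]

print(img_bgr[0, 0])
print(img_rgb[0, 0])

# The slice is a negative-stride view; copy it into a contiguous array
# before handing it to code that expects ordinary memory layout.
img_rgb = np.ascontiguousarray(img_rgb)
```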
    1.3 Converting between skimage and numpy
    import numpy as np
    from skimage import io, transform
    import matplotlib.pyplot as plt

    # read an image with 3 channels, 500x889 pixels
    img_sk = io.imread('./test.png')
    # get image info
    print(img_sk.shape)
    io.imshow(img_sk)
    # get a pixel value in skimage format
    print(img_sk[0, 0])
    # convert skimage to numpy
    img_np = np.array(img_sk)
    print(img_np.shape)
    # get a pixel value in numpy format
    print(img_np[0, 0])
    # convert numpy to skimage
    io.imshow(img_np)
    plt.show()

    """
    (889, 500, 3)
    [219 210 193]
    (889, 500, 3)
    [219 210 193]
    """

    Note:

    skimage behaves much like cv2: the output is nearly identical (apart from the channel order, which is RGB here), and conversion to and from numpy is just as easy. skimage cannot display an image window on its own and relies on matplotlib.pyplot, so skimage is usually combined with pyplot to visualize intermediate results, which makes it convenient to draw plots and charts.
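
    As a rough illustration of using pyplot for visual comparison, the sketch below shows two arrays side by side. The arrays are synthetic stand-ins for images loaded with io.imread, and the sizes, titles and in-memory output buffer are all assumptions made for the example; the Agg backend is selected so the script also runs on a machine without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to get a window via plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins for two images loaded with skimage (both are plain numpy arrays).
height, width = 60, 40
original = np.zeros((height, width, 3), dtype=np.uint8)
original[:, :, 0] = np.linspace(0, 255, width, dtype=np.uint8)  # horizontal red ramp
flipped = original[:, ::-1]  # mirrored copy to compare against

# One row, two columns: each image gets its own axes and title.
fig, axes = plt.subplots(1, 2, figsize=(6, 4))
for ax, image, title in zip(axes, [original, flipped], ["original", "flipped"]):
    ax.imshow(image)
    ax.set_title(title)
    ax.axis("off")

# Render the figure to PNG bytes in memory instead of writing a file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```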

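    In summary, PIL preserves as much of the original input information as possible and is quick and convenient to use; it is also commonly combined with the imageio library for image preprocessing. cv2 turns the image into an array, which makes further processing easy. skimage combined with matplotlib makes it more convenient to compare images.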

    2 Reading images in PyTorch

    torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=False, num_workers=8, drop_last=False)

    When calling PyTorch's DataLoader you pass in a dataset. Here the dataset is a custom class that returns each image with its label and also applies data augmentation to the image; during augmentation the image is a PIL object. The Stanford Cars dataset (Stanford_car) serves as the example (code source: sourcecode):

    import os

    import imageio
    import numpy as np
    from PIL import Image
    from torchvision import transforms


    class STANFORD_CAR():
        def __init__(self, input_size, root, is_train=True, data_len=None):
            self.input_size = input_size
            self.root = root
            self.is_train = is_train
            train_img_path = os.path.join(self.root, 'cars_train')
            test_img_path = os.path.join(self.root, 'cars_test')
            train_img_label = []
            test_img_label = []
            # each label line holds an image name and a 1-based class id, separated by a space
            with open(os.path.join(self.root, 'train.txt')) as train_label_file:
                for line in train_label_file:
                    train_img_label.append([os.path.join(train_img_path, line[:-1].split(' ')[0]),
                                            int(line[:-1].split(' ')[1]) - 1])
            with open(os.path.join(self.root, 'test.txt')) as test_label_file:
                for line in test_label_file:
                    test_img_label.append([os.path.join(test_img_path, line[:-1].split(' ')[0]),
                                           int(line[:-1].split(' ')[1]) - 1])
            self.train_img_label = train_img_label[:data_len]
            self.test_img_label = test_img_label[:data_len]

        def __getitem__(self, index):
            if self.is_train:
                img, target = imageio.imread(self.train_img_label[index][0]), self.train_img_label[index][1]
                # grayscale images are 2-D; repeat the channel so every image has 3 channels
                if len(img.shape) == 2:
                    img = np.stack([img] * 3, 2)
                img = Image.fromarray(img, mode='RGB')
                img = transforms.Resize((self.input_size, self.input_size), Image.BILINEAR)(img)
                # img = transforms.RandomResizedCrop(size=self.input_size, scale=(0.4, 0.75), ratio=(0.5, 1.5))(img)
                # img = transforms.RandomCrop(self.input_size)(img)
                img = transforms.RandomHorizontalFlip()(img)
                img = transforms.ColorJitter(brightness=0.2, contrast=0.2)(img)
                img = transforms.ToTensor()(img)
                img = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])(img)
            else:
                img, target = imageio.imread(self.test_img_label[index][0]), self.test_img_label[index][1]
                if len(img.shape) == 2:
                    img = np.stack([img] * 3, 2)
                img = Image.fromarray(img, mode='RGB')
                img = transforms.Resize((self.input_size, self.input_size), Image.BILINEAR)(img)
                # img = transforms.CenterCrop(self.input_size)(img)
                img = transforms.ToTensor()(img)
                img = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])(img)
            return img, target

        def __len__(self):
            if self.is_train:
                return len(self.train_img_label)
            else:
                return len(self.test_img_label)

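    This code combines three libraries to load and augment images: imageio reads the file, numpy expands grayscale images to three channels, and PIL provides the image object that the torchvision transforms operate on.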
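
    Two steps in __getitem__ are easy to check in isolation: stacking a grayscale array into three channels, and the layout change performed by ToTensor followed by Normalize. The sketch below is a NumPy-only emulation of those two transforms, not the torchvision implementation, and assumes a uint8 input; the 3x4 grayscale array is a made-up input, while the mean and std values are the ones used in the class above.

```python
import numpy as np
from PIL import Image

# Hypothetical grayscale input: a 2-D uint8 array, as a reader returns for a gray image.
gray = np.arange(12, dtype=np.uint8).reshape(3, 4) * 20

# Same trick as in __getitem__: repeat the single channel three times along axis 2.
rgb = np.stack([gray] * 3, 2)
img = Image.fromarray(rgb)  # mode is inferred as RGB from the (H, W, 3) uint8 array

# NumPy emulation of ToTensor: HWC uint8 in [0, 255] -> CHW float32 in [0, 1].
chw = np.asarray(img).transpose(2, 0, 1).astype(np.float32) / 255.0

# NumPy emulation of Normalize: per-channel (x - mean) / std.
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)
normalized = (chw - mean) / std

print(rgb.shape, img.mode, chw.shape)
```

    The channel axis moves from last to first, so a tensor produced this way has to be transposed back to (height, width, channels) before it can be shown with PIL, cv2 or pyplot.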
