When visualizing the intermediate layers of a convolutional neural network, only the convolutional and pooling layers are considered, because only these two kinds of layers output feature maps; by the time the data reaches the fully connected layers, the input has already been flattened into a 1-D array and can no longer be displayed as an image.
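The script below simply takes the first three layers with `model.layers[:3]`, which works because in this model those happen to be the conv/pooling block. For models where that is not the case, the feature-map layers can instead be selected by type. This is only a minimal sketch under that idea, assuming tensorflow.keras; `feature_map_outputs` is an illustrative helper name, not part of the original code:

from tensorflow.keras import layers

def feature_map_outputs(model):
    # Keep only layers whose outputs are still feature maps
    # (convolution / pooling); after Flatten the activations are
    # 1-D vectors and cannot be shown as images.
    kinds = (layers.Conv2D, layers.MaxPooling2D, layers.AveragePooling2D)
    return [layer.output for layer in model.layers if isinstance(layer, kinds)]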
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import load_model, Model
from keras import backend as K
from matplotlib import pyplot as plt

(x_train, y_train), (x_test, y_test) = mnist.load_data()
image = x_train[0]
plt.imshow(image)

# Load the trained convolutional neural network
model = load_model('mnist_cnn.h5')
model.summary()

# Visualizing the intermediate layers
# Extract the outputs of the first 3 layers: in this model only the first
# 3 layers are convolutional / pooling layers.
layer_outputs = [layer.output for layer in model.layers[:3]]

# Use Model to build a new model whose input is the same as the original
# model's, but which has multiple outputs -- the 3 extracted layer outputs.
activation_model = Model(inputs=model.input, outputs=layer_outputs)

'''
Feeding one image into this model returns the results of the first 3 layers
of the original model. First reshape the image to the required input shape
and size, then pass it to the newly defined model, which produces 3 outputs,
one for each of the first 3 layers.
first_layer_activation is a (1, 26, 26, 32) array: the feature maps are
26x26 and there are 32 channels, i.e. 32 feature maps.
'''
image = np.reshape(image, (1, 28, 28, 1))
activations = activation_model.predict(image)
# first_layer_activation = activations[0]
# plt.matshow(first_layer_activation[0, :, :, 1], cmap='viridis')

layer_names = []
for layer in model.layers[:3]:
    layer_names.append(layer.name)

images_per_row = 16  # put 16 feature maps per row when plotting

for layer_name, layer_activation in zip(layer_names, activations):
    # Each activation has shape (1, size, size, n_features)
    n_features = layer_activation.shape[-1]
    size = layer_activation.shape[1]

    # Build a grid and tile the feature maps onto it
    n_cols = n_features // images_per_row
    display_grid = np.zeros((size * n_cols, images_per_row * size))
    for col in range(n_cols):
        for row in range(images_per_row):
            channel_image = layer_activation[0, :, :, col * images_per_row + row]
            # Normalize each channel to a displayable 0-255 range; the small
            # epsilon avoids division by zero for all-zero (dead) channels
            channel_image -= channel_image.mean()
            channel_image /= (channel_image.std() + 1e-5)
            channel_image *= 64
            channel_image += 128
            channel_image = np.clip(channel_image, 0, 255).astype('uint8')
            display_grid[col * size:(col + 1) * size,
                         row * size:(row + 1) * size] = channel_image

    scale = 1. / size
    plt.figure(figsize=(scale * display_grid.shape[1],
                        scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')
    plt.savefig("layer_" + layer_name + ".jpg")

The printed model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 64)        0
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0
_________________________________________________________________
dense (Dense)                (None, 128)               1179776
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
Idea: perform gradient ascent on the input image. Start from a blank (or near-blank) image and repeatedly update its pixel values with the gradient, so as to maximize a loss defined as the response of a chosen filter, i.e. to make the filter's activation on the input image as large as possible. After this gradient-ascent process we obtain the image to which the chosen filter responds most strongly; in other words, the filter is highly sensitive to images of this kind, and for a given input image it will most readily pick up features of that type.
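A minimal sketch of this gradient-ascent procedure for a single filter, assuming a TensorFlow 2.x environment (tensorflow.keras, eager execution) rather than the keras backend API imported above, and assuming the model was trained on inputs scaled to [0, 1]; the layer name 'conv2d_1' and filter index 0 are illustrative choices taken from the model summary above:

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import load_model, Model
from matplotlib import pyplot as plt

model = load_model('mnist_cnn.h5')
layer_name = 'conv2d_1'   # target convolutional layer (see model summary)
filter_index = 0          # which filter of that layer to maximize

# Sub-model that maps the input image to the chosen layer's activations
feature_extractor = Model(inputs=model.input,
                          outputs=model.get_layer(layer_name).output)

# Start from a gray image with a little noise instead of an all-zero image,
# so the gradients are not stuck at zero from the first step
# (assumes inputs in [0, 1]).
image = tf.Variable(
    np.random.uniform(0.4, 0.6, size=(1, 28, 28, 1)).astype('float32'))

learning_rate = 10.0
for step in range(30):
    with tf.GradientTape() as tape:
        activation = feature_extractor(image)
        # The "loss" is the mean activation of the chosen filter;
        # maximizing it pushes the input toward the pattern this
        # filter responds to most strongly.
        loss = tf.reduce_mean(activation[:, :, :, filter_index])
    grads = tape.gradient(loss, image)
    # Normalize the gradient so the step size stays stable
    grads /= tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-8
    image.assign_add(learning_rate * grads)  # gradient *ascent*

# The resulting 28x28 pattern is what this filter "looks for"
result = image.numpy()[0, :, :, 0]
plt.matshow(result, cmap='viridis')
plt.savefig('filter_' + layer_name + '_' + str(filter_index) + '.jpg')

Looping over filter_index (e.g. the 64 filters of conv2d_1) and tiling the results on a grid, as done for the feature maps above, gives an overview of what the whole layer has learned.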
