Image Generation in 10 Minutes with Generative Adversarial Networks
Machines are generating near-perfect images these days, and it is becoming more and more difficult to distinguish machine-generated images from the originals.
If you are reading this article, I am sure that we share similar interests and are (or will be) in similar industries. So let's connect via LinkedIn! Please do not hesitate to send a contact request! Orhan G. Yalçın, LinkedIn
After receiving more than 300k views on my article, Image Classification in 10 Minutes with MNIST Dataset, I decided to prepare another tutorial on deep learning. But this time, instead of classifying images, we will generate images using the same MNIST dataset, which stands for Modified National Institute of Standards and Technology database. It is a large database of handwritten digits that is commonly used for training various image processing systems [1].
To generate (well, basically) anything with machine learning, we have to use a generative algorithm, and at least for now, one of the best-performing generative algorithms for image generation is the Generative Adversarial Network (GAN).
The invention of GANs came about quite unexpectedly. The famous AI researcher Ian Goodfellow, then a Ph.D. fellow at the University of Montreal, landed on the idea while discussing the flaws of other generative algorithms with his friends at a going-away party. After the party, he came home with high hopes and implemented the concept he had in mind. Surprisingly, everything went as he hoped on the first trial [5], and he successfully created Generative Adversarial Networks (GANs, for short). According to Yann LeCun, the director of AI research at Facebook and a professor at New York University, GANs are "the most interesting idea in the last 10 years in machine learning" [6].
The rough structure of a GAN can be illustrated as follows:
Figure 4. Generative Adversarial Networks (GANs) utilizing CNNs | (Graph by author)

In an ordinary GAN structure, there are two agents competing with each other: a Generator and a Discriminator. They may be designed using different networks (e.g., Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), or just regular neural networks (ANNs or RegularNets)). Since we will generate images, CNNs are better suited for the task. Therefore, we will build our agents with convolutional neural networks.
In a nutshell, we will ask the Generator to generate handwritten digits without giving it any additional data. Simultaneously, we will feed the existing handwritten digits to the Discriminator and ask it to decide whether the images generated by the Generator are genuine or not. At first, the Generator will generate lousy images that the Discriminator immediately labels as fake. After getting enough feedback from the Discriminator, the Generator will learn to trick it, as its outputs deviate less and less from the genuine images. Consequently, we will obtain a very good generative model that can give us very realistic outputs.
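Formally, this adversarial game is the minimax objective from the original GAN paper (Goodfellow et al., 2014): the Discriminator D is trained to assign high scores to real samples x and low scores to generated samples G(z), while the Generator G is trained to make D fail:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$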
GAN training involves computationally expensive calculations, so a GPU-enabled machine will make your life a lot easier. I will therefore use Google Colab to decrease the training time with GPU acceleration.
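If you want to verify that your Colab runtime actually has a GPU attached (Runtime > Change runtime type > Hardware accelerator > GPU), a one-line sanity check like the following should suffice; this snippet is my own addition and not part of the tutorial's code:

```python
import tensorflow as tf

# An empty list means TensorFlow only sees the CPU; otherwise it prints the GPU device(s)
print(tf.config.list_physical_devices('GPU'))
```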
For machine learning tasks, I long used Jupyter Notebook (iPython) via the Anaconda distribution almost exclusively for model building, training, and testing. Lately, though, I have switched to Google Colab for several good reasons.
Google Colab offers several additional features on top of the Jupyter Notebook, such as (i) collaboration with other developers, (ii) cloud-based hosting, and (iii) GPU- and TPU-accelerated training. You can do all of this with the free version of Google Colab. The relationship between Python, Jupyter Notebook, and Google Colab can be visualized as follows:
Figure 6. Relationship between iPython, Jupyter Notebook, and Google Colab | (Graph by author)

Anaconda provides a free and open-source distribution of the Python and R programming languages for scientific computing, with tools like Jupyter Notebook (iPython) and Jupyter Lab. On top of these tools, Google Colab lets its users run iPython notebooks and lab tools with the computing power of Google's servers.
Now that we have a general understanding of generative adversarial networks as our neural network architecture and Google Colaboratory as our programming environment, we can start building our model. In this tutorial, we will do our own take on the official TensorFlow tutorial [7].
Colab already has most machine learning libraries pre-installed, so you can simply import them as shared below:
```python
import tensorflow as tf
from tensorflow.keras.layers import (Dense, BatchNormalization, LeakyReLU,
                                     Reshape, Conv2DTranspose, Conv2D,
                                     Dropout, Flatten)
import matplotlib.pyplot as plt
```

For the sake of shorter code, I prefer to import the layers individually, as shown above.
For this tutorial, we can use the MNIST dataset. The MNIST dataset contains 60,000 training images and 10,000 testing images taken from American Census Bureau employees and American high school students [8].
Luckily, we can retrieve the MNIST dataset directly from the TensorFlow library. We retrieve it from TensorFlow because this way we get an already-processed version of it. We still need to do a bit of preparation and processing to fit our data into the GAN model. Therefore, in the second line of the code below, we separate the dataset into train and test groups and also separate the labels from the images.
The x_train and x_test parts contain grayscale codes (from 0 to 255), while the y_train and y_test parts contain labels from 0 to 9, representing which digit each image actually is. Since we are doing an unsupervised learning task, we do not need the label values, and therefore we use underscores (i.e., _) to ignore them. We also need to convert our dataset to 4 dimensions with the reshape function. Finally, we convert our NumPy array to a TensorFlow Dataset object for more efficient training. The lines below do all these tasks:
```python
# underscore to omit the label arrays
(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()

train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # Normalize the images to [-1, 1]

BUFFER_SIZE = 60000
BATCH_SIZE = 256

# Batch and shuffle the data
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
```

Our data is already processed, and it is time to build our GAN model.
As mentioned above, every GAN must have at least one generator and one discriminator. Since we are dealing with image data, we will benefit from Convolution and Transposed Convolution (inverse convolution) layers in these networks. Let's define our generator and discriminator networks below.
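Before we do, a quick aside: if transposed convolutions are new to you, the tiny standalone check below (my own illustration, not part of the model we build next) shows how a stride-2 Conv2DTranspose with 'same' padding doubles the spatial size of a feature map:

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2DTranspose

x = tf.random.normal([1, 7, 7, 256])  # a small feature map: batch of 1, 7x7, 256 channels
y = Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same')(x)
print(y.shape)  # (1, 14, 14, 128): height and width are doubled, channels become 128
```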
Our generator network is responsible for generating 28x28-pixel grayscale fake images from random noise. Therefore, it needs to accept 1-dimensional arrays as input and output 28x28-pixel images. For this task, we need Transposed Convolution layers after reshaping our 1-dimensional array into a 2-dimensional one. Transposed Convolution layers can increase the size of a smaller array. We also take advantage of BatchNormalization and LeakyReLU layers. The lines below define a function that builds a generator network with the Keras Sequential API:
```python
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: None is the batch size

    model.add(Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
```

We can call our generator function with the following code:
```python
generator = make_generator_model()
```

Figure 7. The Summary of Our Generator Network | (Graph by author)

Now that we have our generator network, we can easily generate a sample image with the following code:
```python
# Create a random noise and generate a sample
noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)

# Visualize the generated sample
plt.imshow(generated_image[0, :, :, 0], cmap='gray')
```

which would look like this:
Figure 8. A Sample Image Generated by the Non-Trained Generator Network | (Image by author)

It is just plain noise. But the fact that it can create an image from a random noise array proves its potential.
For our discriminator network, we need to follow the inverse of our generator network. It takes the 28x28-pixel image data and outputs a single value representing how likely the image is to be authentic. So, our discriminator can review whether a sample image generated by the generator is fake.
We follow the same method we used to create the generator network. The following lines define a function that builds a discriminator model with the Keras Sequential API:
```python
def make_discriminator_model():
    model = tf.keras.Sequential()

    model.add(Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    model.add(Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    model.add(Flatten())
    model.add(Dense(1))

    return model
```

We can call the function to create our discriminator network with the following line:
```python
discriminator = make_discriminator_model()
```

Figure 9. The Summary of Our Discriminator Network | (Graph by author)

Finally, we can check what our non-trained discriminator says about the sample generated by the non-trained generator:
```python
decision = discriminator(generated_image)
print(decision)
```

Output: tf.Tensor([[-0.00108097]], shape=(1, 1), dtype=float32)
A negative value shows that our non-trained discriminator concludes that the image sample in Figure 8 is fake. At the moment, what's important is that it can examine images and produce a verdict; the results will become much more reliable after training.
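Because our loss will be built with from_logits=True (see the next section), this output is a raw logit rather than a probability. If you would rather read it as a probability, you can pass it through a sigmoid yourself; this is an optional inspection step I am adding for clarity:

```python
# Squash the raw logit into [0, 1]; values below 0.5 mean the discriminator leans towards "fake"
print(tf.sigmoid(decision))  # ~0.4997 for the logit above
```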
Since we are training two sub-networks inside a GAN network, we need to define two loss functions and two optimizers.
Loss Functions: We start by creating a BinaryCrossentropy object from the tf.keras.losses module, setting the from_logits parameter to True. We then use this object inside custom discriminator and generator loss functions. The discriminator loss is calculated by comparing (i) the discriminator's predictions on real images against an array of ones and (ii) its predictions on generated images against an array of zeros. The generator loss measures how well the generator was able to trick the discriminator; therefore, we compare the discriminator's decisions on the generated images against an array of ones.
Optimizers: We also set two separate optimizers, one for the generator and one for the discriminator network. We can use the Adam optimizer object from the tf.keras.optimizers module.
The following lines configure our loss functions and optimizers:
```python
# This method returns a helper function to compute cross entropy loss
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
```

We would like to have access to previous training steps, and TensorFlow has an option for this: checkpoints. By setting a checkpoint directory, we can save our progress (in the training loop below, every five epochs). This will be especially useful when we restore our model from the latest checkpoint. The following lines use the os library to set a save path and configure the training checkpoints:
```python
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer,
                                 discriminator_optimizer=discriminator_optimizer,
                                 generator=generator,
                                 discriminator=discriminator)
```

Now our data is ready and our model is created and configured. It is time to design our training loop. Note that, at the moment, GANs require custom training loops and steps. I will try to make them as understandable as possible for you. Make sure that you read the comments in the code snippets below.
Let’s create some of the variables with the following lines:
```python
EPOCHS = 60
# We will reuse this seed over time (so it's easier
# to visualize progress in the animated GIF)
num_examples_to_generate = 16
noise_dim = 100
seed = tf.random.normal([num_examples_to_generate, noise_dim])
```

Our seed is the noise that we generate images on top of. The code above creates a normally distributed random array with shape (16, 100).
This is the most unusual part of our tutorial: we are setting up a custom training step. After we define the custom train_step() function and annotate it with tf.function, our model will be trained based on this function.
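If you have not used tf.function before, here is a tiny self-contained example of the same mechanism (a toy function for illustration only): on the first call, TensorFlow traces the Python function and compiles it into a graph, and later calls reuse that graph instead of re-running eager operations:

```python
@tf.function
def scaled_sum(x, y):
    # Traced and compiled into a TensorFlow graph on the first call
    return tf.reduce_sum(x * y)

print(scaled_sum(tf.ones([2, 2]), tf.fill([2, 2], 3.0)))  # tf.Tensor(12.0, shape=(), dtype=float32)
```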
The code below, with extensive comments, implements the training step. Please read the comments carefully:
```python
# The tf.function annotation causes the function
# to be "compiled" as part of the training
@tf.function
def train_step(images):

    # 1 - Create random noise to feed into the model
    # for the image generation
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    # 2 - Generate images and calculate loss values
    # GradientTape records operations for automatic differentiation.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # 3 - Calculate gradients using loss values and model variables
    # The "gradient" method computes the gradient using operations
    # recorded in the context of this tape (gen_tape and disc_tape).
    # It accepts a target (e.g., gen_loss) and
    # a source (e.g., generator.trainable_variables):
    #   target --> a list or nested structure of Tensors or Variables to be differentiated
    #   source --> a list or nested structure of Tensors or Variables;
    #              target will be differentiated against elements in sources.
    # The "gradient" method returns a list or nested structure of Tensors
    # (or IndexedSlices, or None), one for each element in sources,
    # with the same structure as sources.
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    # 4 - Process the gradients and run the optimizers
    # The "apply_gradients" method processes aggregated gradients,
    # e.g., optimizer.apply_gradients(zip(grads, vars))
    # Example use of apply_gradients with manually aggregated gradients:
    #   grads = tape.gradient(loss, vars)
    #   grads = tf.distribute.get_replica_context().all_reduce('sum', grads)
    #   optimizer.apply_gradients(zip(grads, vars),
    #                             experimental_aggregate_gradients=False)
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
```

Now that we have created our custom training step with the tf.function annotation, we can define our train function for the training loop.
We define a function, named train, for our training loop. Not only do we run a for loop to iterate the custom training step over the MNIST dataset, but the same function also does the following:
During the Training:
- Start recording the time spent at the beginning of each epoch;
- Produce GIF images and display them;
- Save the model every five epochs as a checkpoint;
- Print out the completed epoch time; and
- Generate a final image after the training is completed.

The following lines, with detailed comments, do all these tasks:
```python
import time
from IPython import display  # A command shell for interactive computing in Python.

def train(dataset, epochs):
    # A. For each epoch, do the following:
    for epoch in range(epochs):
        start = time.time()

        # 1 - For each batch of the epoch,
        for image_batch in dataset:
            # 1.a - run the custom "train_step" function
            # we just declared above
            train_step(image_batch)

        # 2 - Produce images for the GIF as we go
        display.clear_output(wait=True)
        generate_and_save_images(generator, epoch + 1, seed)

        # 3 - Save the model every 5 epochs as
        # a checkpoint, which we will use later
        if (epoch + 1) % 5 == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)

        # 4 - Print out the completed epoch no. and the time spent
        print('Time for epoch {} is {} sec'.format(epoch + 1, time.time() - start))

    # B. Generate a final image after the training is completed
    display.clear_output(wait=True)
    generate_and_save_images(generator, epochs, seed)
```

In the train function, there is a custom image generation function that we haven't defined yet. Our image generation function does the following tasks:
- Generate images by using the model;
- Display the generated images in a 4x4 grid layout using matplotlib;
- Save the final figure.

The following lines are in charge of these tasks:
```python
def generate_and_save_images(model, epoch, test_input):
    # Notice `training` is set to False.
    # This is so all layers run in inference mode (batchnorm).

    # 1 - Generate images
    predictions = model(test_input, training=False)

    # 2 - Plot the generated images
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    # 3 - Save the generated images
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
```

After defining these three fairly complex functions, starting the training is easy. Just call the train function with the arguments below:
```python
train(train_dataset, EPOCHS)
```

If you use a GPU-enabled Google Colab notebook, the training takes around 10 minutes. If you are using a CPU, it may take much longer. Let's see our final product after 60 epochs.
Figure 10. The Digits Generated by Our GAN after 60 Epochs. Note that we are seeing 16 samples because we configured our output this way. | (Image by author)

Before generating new images, let's make sure we restore the values from the latest checkpoint with the following line:
```python
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
```

We can also view the evolution of our generative GAN model by viewing the generated 4x4 grid of 16 sample digits for any epoch with the following code:
```python
# PIL is a library which may open different image file formats
import PIL

# Display a single image using the epoch number
def display_image(epoch_no):
    return PIL.Image.open('image_at_epoch_{:04d}.png'.format(epoch_no))

display_image(EPOCHS)
```

Or, better yet, let's create a GIF image visualizing the evolution of the samples generated by our GAN with the following code:
```python
import glob     # The glob module is used for Unix-style pathname pattern expansion.
import imageio  # A library that provides an easy interface to read and write a wide range of image data.

anim_file = 'dcgan.gif'

with imageio.get_writer(anim_file, mode='I') as writer:
    filenames = sorted(glob.glob('image*.png'))
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

display.Image(open('dcgan.gif', 'rb').read())
```

Our output is as follows:
Figure 11. The GIF Image Showing the Evolution of Our GAN-Generated Sample Digits over Time | (Image by author)

As you can see in Figure 11, the outputs generated by our GAN become much more realistic over time.
You have built and trained a generative adversarial network (GAN) model that can successfully create handwritten digits. There are obviously some samples that are not very clear, but for only 60 epochs of training on only 60,000 samples, I would say the results are very promising.
Once you can build and train this network, you can generate much more complex images:
- by working with a larger dataset with colored images in high definition (see the sketch after this list);
- by creating a more sophisticated discriminator and generator network;
- by increasing the number of epochs;
- by working on GPU-enabled powerful hardware.

In the end, you can create art pieces such as poems, paintings, text, or realistic photos or videos.
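As an example of the first extension, swapping in a color dataset mostly means changing the input pipeline and the number of output channels. The sketch below is an untested starting point under that assumption (the rest of the architecture would still need retuning for 32x32 RGB images):

```python
# CIFAR-10: 50,000 training images, 32x32 pixels, 3 color channels
(train_images, _), (_, _) = tf.keras.datasets.cifar10.load_data()
train_images = train_images.astype('float32')
train_images = (train_images - 127.5) / 127.5  # normalize to [-1, 1], matching the tanh output

# The generator's final layer would then need 3 output channels, e.g.:
# Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh')
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(50000).batch(256)
```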
If you would like access to the full code on Google Colab and to my latest content, subscribe to the mailing list: ✉️
[1] Orhan G. Yalcin, Image Classification in 10 Minutes with MNIST Dataset, Towards Data Science, https://towardsdatascience.com/image-classification-in-10-minutes-with-mnist-dataset-54c35b77a38d
[2] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, and Jaakko Lehtinen (NVIDIA), Analyzing and Improving the Image Quality of StyleGAN. Retrieved from https://github.com/NVlabs/stylegan2
[3] Or Sharir, Ronen Tamari, Nadav Cohen, and Amnon Shashua, Tensorial Mixture Models, https://www.researchgate.net/profile/Or_Sharir/publication/309131743
[4] Wikipedia, File:Ian Goodfellow.jpg, https://upload.wikimedia.org/wikipedia/commons/f/fe/Ian_Goodfellow.jpg
SYNCED, Father of GANs Ian Goodfellow Splits Google For Apple, https://medium.com/syncedreview/father-of-gans-ian-goodfellow-splits-google-for-apple-279fcc54b328
[5] YOUTUBE, Heroes of Deep Learning: Andrew Ng interviews Ian Goodfellow, https://www.youtube.com/watch?v=pWAc9B2zJS4
[6] George Lawton, Generative adversarial networks could be most powerful algorithm in AI, https://searchenterpriseai.techtarget.com/feature/Generative-adversarial-networks-could-be-most-powerful-algorithm-in-AI
[7] Deep Convolutional Generative Adversarial Network, TensorFlow, available at https://www.tensorflow.org/tutorials/generative/dcgan
[8] Wikipedia, MNIST database, https://en.wikipedia.org/wiki/MNIST_database
Original article: https://towardsdatascience.com/image-generation-in-10-minutes-with-generative-adversarial-networks-c2afc56bfa3b