Ultimate Guide for Facial Emotion Recognition Using a CNN
Note: This is a long post that tries to cover everything, so don't get frustrated :)
Hello Folks,
As we know, emotions play a vital role in our life. We need a system that customizes its actions based on our behavior and emotions.
Major tech giants like Google, Microsoft, and Apple are trying to make virtual assistants like Siri, Google Assistant, and Alexa appear more human. These companies are investing heavily in research and development to humanize the AI behind their assistants. The idea is to equip digital assistants with a machine learning model that can detect human facial emotions and act accordingly.
This is what inspired me to take up this project.
Download the dataset from the official Kaggle website via the link below:
The data consists of 48x48 pixel grayscale images of faces. The task is to categorize each face based on the emotion shown in the facial expression into one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral). The data set contains two columns, "emotion" and "pixels". The "emotion" column contains a numeric code ranging from 0 to 6, inclusive, for the emotion present in the image. The "pixels" column contains a quoted string for each image; the contents of this string are space-separated pixel values in row-major order. test.csv contains only the "pixels" column, and your task is to predict the emotion column.

Data preprocessing is one of the important steps in the machine learning pipeline.
Load the pixels CSV file into a data frame.
# loading into a dataframe
import pandas as pd  # assumed import; the original snippet uses pd directly

df = pd.read_csv("/root/.kaggle/fer2013/fer2013.csv")
df.head()

Apply image preprocessing techniques such as resizing, reshaping, grayscale conversion, and normalization.
Use the power of vectorization by converting images into NumPy arrays and pandas data frames whenever necessary.
Convert the images into NumPy arrays using OpenCV, and one-hot encode the output labels using pandas.
import cv2
import numpy as np  # assumed import; the original snippet uses np directly

image_size = (48, 48)
width, height = 48, 48

pixels = df['pixels'].tolist()  # convert the relevant column into a list, one string per row

faces = []
for pixel_sequence in pixels:
    face = [int(pixel) for pixel in pixel_sequence.split(' ')]  # split the string on spaces
    face = np.asarray(face).reshape(width, height)  # convert the list into a 48x48 NumPy array
    face = cv2.resize(face.astype('uint8'), image_size)  # resize to 48 cols (width) and 48 rows (height)
    faces.append(face.astype('float32'))  # collect each 48x48 image as a float array

faces = np.asarray(faces)          # convert the list into a NumPy array
faces = np.expand_dims(faces, -1)  # expand the shape: the last dimension is the (grayscale) color channel
emotions = pd.get_dummies(df['emotion']).to_numpy()  # one-hot encode the emotion labels

Apply normalization to speed up convergence:
x = faces.astype('float32')
x = x / 255.0  # divide the pixels by 255 for normalization => range (0, 1)

# scale the pixel values into the range (-1, 1)
x = x - 0.5
x = x * 2.0

Split the data set into training and validation sets, so that we can use the validation set to check whether the model overfits the training data.
num_samples, num_classes = emotions.shape
num_samples = len(x)
num_train_samples = int((1 - 0.2) * num_samples)  # 80/20 split

# training data
train_x = x[:num_train_samples]
train_y = emotions[:num_train_samples]

# validation data
val_x = x[num_train_samples:]
val_y = emotions[num_train_samples:]

train_data = (train_x, train_y)
val_data = (val_x, val_y)

A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image, and differentiate one from the other. The preprocessing required by a ConvNet is much lower compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training ConvNets can learn these filters/characteristics on their own.
For now, all you need to know is that a Convolutional Neural Network, or CNN as it is popularly called, is a collection of mainly two types of layers:
1. The hidden layers / feature extraction part
- convolutions
- pooling
2. The classifier part
Here I am using Keras with TensorFlow as the backend for building neural networks. More concretely, the Keras layers we will add are:
- Convolution layer
- Pooling layer
- Batch normalization
- Activation layer
- Dropout layer
- Flatten layer
- Dense layer

First import all dependencies:
# load the libraries to build the model
from keras.layers import Activation, Convolution2D, Dropout, Conv2D
from keras.layers import AveragePooling2D, BatchNormalization
from keras.layers import GlobalAveragePooling2D
from keras.models import Sequential
from keras.layers import Flatten
from keras.models import Model
from keras.layers import Input
from keras.layers import MaxPooling2D
from keras.layers import SeparableConv2D
from keras import layers
from keras.regularizers import l2

Keras is a powerful deep learning library that provides a high-level API for TensorFlow and acts as a wrapper around simplified, abstract representations of neural networks.
""" Building up Model Architecture """ model = Sequential() model.add(Convolution2D(filters=16, kernel_size=(7, 7), padding='same', name='image_array', input_shape=input_shape)) model.add(BatchNormalization()) model.add(Convolution2D(filters=16, kernel_size=(7, 7), padding='same')) model.add(BatchNormalization()) model.add(Activation('relu')) model.add(AveragePooling2D(pool_size=(2, 2), padding='same')) model.add(Dropout(.5)) model.add(Convolution2D(filters=32, kernel_size=(5, 5), padding='same')) model.add(BatchNormalization()) model.add(Convolution2D(filters=32, kernel_size=(5, 5), padding='same')) model.add(BatchNormalization()) model.add(Activation('relu')) model.add(AveragePooling2D(pool_size=(2, 2), padding='same')) model.add(Dropout(.5)) model.add(Convolution2D(filters=64, kernel_size=(3, 3), padding='same')) model.add(BatchNormalization()) model.add(Convolution2D(filters=64, kernel_size=(3, 3), padding='same')) model.add(BatchNormalization()) model.add(Activation('relu')) model.add(AveragePooling2D(pool_size=(2, 2), padding='same')) model.add(Dropout(.5)) model.add(Convolution2D(filters=128, kernel_size=(3, 3), padding='same')) model.add(BatchNormalization()) model.add(Convolution2D(filters=128, kernel_size=(3, 3), padding='same')) model.add(BatchNormalization()) model.add(Activation('relu')) model.add(AveragePooling2D(pool_size=(2, 2), padding='same')) model.add(Dropout(.5)) model.add(Convolution2D(filters=256, kernel_size=(3, 3), padding='same')) model.add(BatchNormalization()) model.add(Convolution2D(filters=num_classes, kernel_size=(3, 3), padding='same')) model.add(GlobalAveragePooling2D()) model.add(Activation('softmax',name='predictions'))Implementing Convolutional neural network using Keras may involve following intuitions, insights and back scenes
使用Keras实施卷积神经网络可能涉及以下直觉,见解和背景
Convolution: a matrix multiplication with a filter -> a feature detector
It involves the following components:
- Convolution2D: used for filtering windows of a 2-dimensional input → if it is the first layer, it needs input_shape
- filters | windows; a 2-dim filter acts as a sliding window
- the sliding window slides over each channel and summarizes the features (a small sketch follows)
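To make the filtering idea concrete, here is a minimal sketch (my own illustration, not from the original code) of what a single Conv2D layer does to a 48x48 grayscale input; the filter count and shapes are assumptions chosen to match this post's data:

# illustrative only: one Conv2D layer applied to a 48x48 grayscale input
from keras.models import Sequential
from keras.layers import Convolution2D

demo = Sequential()
demo.add(Convolution2D(filters=16, kernel_size=(7, 7), padding='same',
                       input_shape=(48, 48, 1)))
print(demo.output_shape)  # (None, 48, 48, 16): 16 feature maps; 'same' padding keeps 48x48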
Batch normalization:
It is used to stabilize and perhaps accelerate the learning process by standardizing layer inputs.
It is done by applying a transformation that keeps the mean activation close to 0 and the activation standard deviation (the square root of the variance, a measure of spread) close to 1.
Below are some intuitive points and insights related to batch normalization:
- normalization => the process tends to follow a bell-shaped curve known as the normal distribution
- backpropagation => updates layer by layer, backward from the output to the input, using an estimate of error that assumes the weights in the layers before the current layer are fixed
- the gradient tells how to update each parameter under the assumption that the other layers do not change
- in reality, all layers change during an update → this update procedure leads to forever chasing a moving target
- batch normalization => a technique to coordinate the update of multiple layers in the model => a reparametrization of the network
- it is all about standardizing the mean and variance of each unit toward a normal distribution
- in short: standardizing the inputs to layers for each mini-batch (see the sketch below)
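As a rough numeric sketch of the core transform (illustrative NumPy only; the real BatchNormalization layer additionally learns a scale gamma and a shift beta):

import numpy as np

batch = np.random.rand(32, 10)              # a mini-batch: 32 samples, 10 activations each
mean = batch.mean(axis=0)                   # per-activation mean over the batch
std = batch.std(axis=0)                     # per-activation standard deviation
normalized = (batch - mean) / (std + 1e-5)  # small epsilon avoids division by zero
print(normalized.mean(axis=0).round(3))     # ~0 for every activation
print(normalized.std(axis=0).round(3))      # ~1 for every activation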
Activation (ReLU): the activation layer (non-linear layer)
- a common convention is to apply it after a conv layer
- it introduces non-linearity into a system that has just computed linear operations in the conv layers
- the rectified linear unit is used far more widely than other non-linear functions (sigmoid, tanh) because it trains fast without sacrificing much accuracy
- ReLU: max(0, x) (a tiny numeric sketch follows)
- ReLU also alleviates the vanishing gradient problem (the lower layers of the network train very slowly because the gradient shrinks as it passes back through the layers)
- without these non-linear activation functions, the network would be one large linear classifier that could be simplified by multiplying out the weight matrices (accounting for biases); it would not be able to do anything interesting such as image classification
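A tiny numeric illustration of ReLU (my own sketch, not from the original post):

import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(np.maximum(0, x))  # [0. 0. 0. 1.5 3.]: negatives are zeroed, positives pass through unchanged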
Pooling: involves the following types: max, average, global max, global average
- order: conv > activation > pooling
- reduces the dimensions of the feature map, cutting the number of parameters to learn and the amount of computation
- it summarizes the feature map rather than keeping the precisely positioned features generated by the conv layer; this makes the model more robust to variations in the position of features in the image
- when the network wants to detect higher-level features from low-level building blocks (detecting corners from edges), we do not need to be rigid about the exact position; we need translational invariance at the feature level, so we insert pooling (a small sketch follows)
- it overcomes the problem of sensitivity to the location of features
- local translation invariance
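Here is a small illustrative sketch (assumed values, plain NumPy) of 2x2 max pooling summarizing a 4x4 feature map into 2x2:

import numpy as np

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 4, 3, 8]])
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))  # take the max over each 2x2 window
print(pooled)  # [[6 4]
               #  [7 9]] -> the 4x4 map is summarized into 2x2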
Dropout: a regularization technique
- neurons are randomly dropped while training
- this makes the network less sensitive to the specific weights of individual neurons
- better generalization, less overfitting (a small sketch follows)
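A rough sketch of what dropout does during a training pass (illustrative inverted dropout, not Keras internals):

import numpy as np

p = 0.5                                          # drop probability
activations = np.array([0.2, 0.9, 0.4, 0.7, 0.1, 0.8])
mask = np.random.rand(activations.size) > p      # keep each unit with probability 1 - p
dropped = activations * mask / (1 - p)           # survivors are scaled up to preserve the expected sum
print(dropped)                                   # roughly half the activations are zeroed on this pass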
model.summary(): gives information about the architecture and configuration of the neural network.
Data Augmentation:
Neural networks are data hungry. It's best practice to use a large dataset, since more data reduces overfitting and improves generalization.
So we apply data augmentation using ImageDataGenerator() in Keras:
""" Data Augmentation => taking the batch and apply some series of random transformations (random rotation, resizing, shearing) ===> to increase generalizability of model """ # data generator Generate batches of tensor image data with real-time data augmentation data_generator = ImageDataGenerator( featurewise_center=False, featurewise_std_normalization=False, rotation_range=10, width_shift_range=0.1, height_shift_range=0.1, zoom_range=.1, horizontal_flip=True)Here is a brief summary of ImageDataGenerator() and it’s parameters and usage.
这是ImageDataGenerator()的简短摘要,以及它的参数和用法。
Horizontal and vertical shift:
For moving all pixels of an image in one direction, either vertically or horizontally.
- width_shift_range (horizontal shift)
- height_shift_range (vertical shift)
- a floating number in [0, 1] → the fraction of the width/height to shift by

Horizontal and vertical flip augmentation: for reversing rows or columns of pixels → True or False
Random rotation → from 0 to 360 degrees
If rotation_range = 90 ==> the random rotation applied to an image is between 0 and 90 degrees.
Random brightness: randomly darkens or brightens images. If brightness_range = [0.2, 1.0] → a brightness factor is picked between 0.2 and 1.0 for each image (values below 1.0 darken it).
Random zoom: either adds pixels to or removes pixels from the image.
- the range is [1 - value, 1 + value]
- for example, zoom_range = .3 → means the range [0.7, 1.3], i.e. between 70% (zoom in) and 130% (zoom out)

When an object is created with the arguments above, an iterator can be created for an image dataset:
- to iterate through images in memory → use obj.flow(X, y) (a short usage sketch follows)
- to iterate through images in subdirectories → use obj.flow_from_directory(...)
- for training ==> fit_generator()
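As a quick sanity check, here is a minimal usage sketch (assuming the data_generator, train_x, and train_y defined earlier in this post): pull one augmented batch from the generator and inspect its shape.

batch_x, batch_y = next(data_generator.flow(train_x, train_y, batch_size=32))
print(batch_x.shape, batch_y.shape)  # expected: (32, 48, 48, 1) (32, 7)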
It's time to set the configuration for our neural network :)
# model parameters / compilation
""" CONFIGURATION ==> .compile(optimizer, loss, metrics) """
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
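For intuition, here is a tiny numeric illustration (my own, not from the original post) of what categorical_crossentropy computes: the negative log-probability the softmax assigns to the true class.

import numpy as np

y_true = np.array([0, 0, 0, 1, 0, 0, 0])                    # one-hot: the true class is "Happy" (index 3)
y_pred = np.array([0.05, 0.05, 0.1, 0.6, 0.1, 0.05, 0.05])  # a softmax output
loss = -np.sum(y_true * np.log(y_pred))
print(loss)  # ~0.51; a more confident correct prediction drives this toward 0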
Now it's time to train our model. Training is nothing but a learning loop. Here we define hyperparameters such as the number of epochs, batch size, and learning rate. The only way to find the best values is by trying.
# imports
from keras.callbacks import CSVLogger, ModelCheckpoint, EarlyStopping
from keras.callbacks import ReduceLROnPlateau

# parameters
batch_size = 32       # number of samples per gradient update
num_epochs = 200      # number of epochs to train the model
verbose = 1           # per-epoch progress bar
num_classes = 7
patience = 50
datasets = ['fer2013']
base_path = "/content"

for dataset_name in datasets:
    print('Training dataset:', dataset_name)

    # callbacks
    log_file_path = dataset_name + '_emotion_training.log'
    csv_logger = CSVLogger(log_file_path, append=False)
    early_stop = EarlyStopping('val_loss', patience=patience)
    reduce_lr = ReduceLROnPlateau('val_loss', factor=0.1,
                                  patience=int(patience/4), verbose=1)
    trained_models_path = base_path + dataset_name + 'simple_cnn'
    model_names = trained_models_path + '.{epoch:02d}-{val_loss:.2f}.hdf5'
    model_checkpoint = ModelCheckpoint(model_names, 'val_loss', verbose=1,
                                       save_best_only=True)
    my_callbacks = [model_checkpoint, csv_logger, early_stop, reduce_lr]

    # loading dataset
    train_faces, train_emotions = train_data
    history = model.fit_generator(data_generator.flow(train_faces, train_emotions, batch_size),
                                  epochs=num_epochs, verbose=1,
                                  callbacks=my_callbacks,  # not [my_callbacks]: it is already a list
                                  validation_data=val_data)

We use callbacks to record model performance while the model is learning.
Here I want to explain everything in the code and briefly mention some insights.
A callback is an object that can perform actions at various stages of training, for example:
1. write TensorBoard logs after every batch
2. periodically save the model to disk
3. do early stopping
4. view internal states and statistics during training

Callbacks are passed into the fit() loop.

CSVLogger(filename, separator=','):
It is used to save epoch results to a CSV file.
Create a CSVLogger object and pass it to fit(callbacks=[csv_logger_obj]). A short sketch of reading the log back follows.
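For example, after training you can read the log back with pandas (the filename below matches the log_file_path built in the training code above):

import pandas as pd

log = pd.read_csv('fer2013_emotion_training.log')
print(log[['epoch', 'loss', 'val_loss']].tail())  # per-epoch metrics recorded by CSVLogger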
EarlyStopping():
It is used to stop training when a monitored metric has stopped improving. Below are the parameters used:

- monitor = "val_loss" → the metric to be monitored
- min_delta → the minimum change that counts as an improvement (threshold)
- patience → the number of epochs with no improvement after which training is stopped

ReduceLROnPlateau():
It is used to reduce the learning rate when a metric has stopped improving.
Below are the parameters:

- monitor, patience, min_delta
- factor = 0.1 ==> the learning rate is reduced to 10% (lr * 0.1)
- verbose ==> 0: quiet, 1: update messages

ModelCheckpoint():
It saves the Keras model or the model weights at some frequency.
Below are the parameters used (a reload sketch follows the list):

- filepath
- monitor → val_acc or val_loss
- save_best_only = True
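After training, the best checkpoint can be reloaded. A hedged sketch follows; the .hdf5 filename is hypothetical, since ModelCheckpoint stamps the real file with the actual epoch number and val_loss:

from keras.models import load_model

# hypothetical filename; check your working directory for the one actually written
best_model = load_model('fer2013simple_cnn.42-0.95.hdf5')
score = best_model.evaluate(val_x, val_y, verbose=0)
print('Best checkpoint val loss:', score[0])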
It's time to check how well the model learned the patterns from the training dataset.
# evaluate() returns [loss, acc]
score = model.evaluate(val_x, val_y, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1]*100)

Let's see how the accuracy changes over the epochs, using the history object.
History is a default callback that is registered when training.
- It records training metrics for each epoch
- The history object is returned from the call to the fit() function used to train the model
- Metrics are stored in a dictionary in the history member of the returned object

""" metrics collected by the history object """
history_dict = history.history
history_dict.keys()

For demo purposes, below I use the accuracy record for 20 epochs.
Visualizing model training history:
Using matplotlib, let's visualize the learning curve of the model.
Below is the code snippet to plot loss vs. the number of epochs, to visualize how the loss decreases as training progresses.
""" Visualising model training history """ import matplotlib.pyplot as plt train_loss_values = history_dict['loss'] val_loss_values = history_dict['val_loss'] epochs = range(1, len(history_dict['accuracy']) + 1) plt.plot(epochs, train_loss_values, 'bo', label='Training loss') plt.plot(epochs, val_loss_values, 'b', label='Validation loss') plt.title('Training and validation loss') plt.xlabel('Epochs') plt.ylabel('Loss') plt.legend() plt.show()Result:
结果:
Below is the code snippet to plot accuracy vs. the number of epochs, to visualize how the accuracy increases as training progresses.
train_acc = history_dict['accuracy']
val_acc = history_dict['val_accuracy']

plt.plot(epochs, train_acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')  # fixed: the original snippet labeled this axis 'Loss'
plt.legend()
plt.show()

Result:
To create a sense of feel, I add emojis to the results.
emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy",
                4: "Sad", 5: "Surprised", 6: "Neutral"}  # index 0 is Angry per the dataset's label mapping

# emoji unicodes
emojis = {0: "\U0001f620", 1: "\U0001f922", 2: "\U0001f628", 3: "\U0001f60A",
          4: "\U0001f625", 5: "\U0001f632", 6: "\U0001f610"}

I define all the emojis relevant to the emotions in a dictionary. You can see the image below.
Now it's time for predictions. Let's test the model with some images.
Before testing, we need to preprocess the test image.
import cv2
from google.colab.patches import cv2_imshow  # assumed: the /content paths suggest Colab, where cv2.imshow is unavailable

def _predict(path):
    facecasc = cv2.CascadeClassifier('/content/haarcascade_frontalface_default.xml')
    imagePath = '/content/' + path
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = facecasc.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=10)
    print("No of faces : ", len(faces))
    i = 0  # fixed: start at 0 so the first detected face is "person 1"
    for (x, y, w, h) in faces:
        i = i + 1
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
        roi_gray = gray[y:y + h, x:x + w]  # cropping the face region
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray, (48, 48)), -1), 0)
        prediction = model.predict(cropped_img)
        maxindex = int(np.argmax(prediction))
        print("person ", i, " : ", emotion_dict[maxindex], "-->", emojis[maxindex])
        cv2.putText(image, emotion_dict[maxindex], (x+10, y-20),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
        # if the text does not appear, adjust the coordinates
    cv2_imshow(image)

I want to add a few things related to OpenCV here for better understanding.
With the help of OpenCV, we can easily preprocess images.
Things to observe in the above code for better intuition:
- np.argmax(prediction) → to get the class with the maximum confidence
- cropping the image → use the power of NumPy slicing

Cascade classifier → ensemble learning based on the concatenation of several classifiers
- not multi-expert but multi-stage
- exploits the combinatorial nature of the classification
- Haar cascade classifier ==> heavily pretrained models stored in XML files

Here are some of the results.
Now let us save the model. It involves both the weights and the architecture.
We don’t need to save the architecture but every time when we load model, we have to again define model architecture before we use it.
我们不需要保存架构,但是每次加载模型时,都必须在使用它之前再次定义模型架构。
So I prefer to save both (the weights into an .h5 file and the architecture into a JSON file).
# saving weights (serialized to HDF5)
model.save_weights("model.h5")

# saving architecture
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)

print("Saved model to disk")

I wrote a function to load the model, so we can load it with ease.
""" loading the model in modular approach """ def load_model_(): json_file = open('model.json', 'r') loaded_model_json = json_file.read() json_file.close() model = model_from_json(loaded_model_json) # load weights into new model model.load_weights("model.h5") return model model = load_model_()I think I document almost everything that needs to this project
我想我记录了该项目所需的几乎所有内容
The full source code with explanations is available at:
This is the overall explanation of the project. I hope you all love it. If you have any queries, feel free to comment below.
For more projects, please visit:
prudhvignv.github.io
Thank you!!
Translated from: https://medium.com/@prudhvi.gnv/ultimate-guide-for-facial-emotion-recognition-using-a-cnn-f9239fdc63ad