Vision based facial emotion detection using deep convolutional neural networks

This is a Bachelor's thesis from Mälardalen University / School of Innovation, Design and Engineering

Abstract: Emotion detection, also known as facial expression recognition, is the task of mapping input data taken from a human to an emotion. It is a powerful tool for extracting valuable information from individuals, and that information can serve many purposes, ranging from medical conditions such as depression to customer feedback. Solving the facial expression recognition problem requires several smaller subtasks, which together form the complete system. These subtasks can be viewed as a pipeline that implements the steps needed to classify an input and output an emotion. In recent years, with the rise of computer vision, images have often been used as input to these systems. They have shown great promise for facial expression recognition because the human face conveys the subject's emotional state and contains more information than other inputs, such as text or audio. Many current state-of-the-art systems combine computer vision with another rising field, namely AI, or more specifically deep learning. The proposed deep learning methods in many cases use a special form of neural network, the convolutional neural network, which specializes in extracting information from images. Classification is then performed with the SoftMax function, which acts as the last stage before the output of the facial expression pipeline. This thesis explores these methods of using convolutional neural networks to extract information from images and builds on them by exploring a set of machine learning algorithms that replace the more commonly used SoftMax function as the classifier, in an attempt not only to increase accuracy further but also to optimize the use of computational resources.
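To make the final classification stage concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a convolutional network has already produced a feature vector for a face image. The label set, weights, feature values, and the nearest-centroid classifier are all hypothetical stand-ins chosen for illustration; the abstract does not name which machine learning algorithms were tested.

```python
import math

EMOTIONS = ["angry", "happy", "sad"]  # hypothetical label set


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_classify(features, weights, biases):
    """Linear layer followed by SoftMax, the usual last stage of a CNN."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))], probs


def nearest_centroid_classify(features, centroids):
    """Stand-in alternative classifier (an assumption, not the thesis's
    choice): pick the class whose mean feature vector is closest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Made-up numbers standing in for CNN features and trained parameters.
features = [0.2, 0.9]
weights = [[1.0, -1.0], [-0.5, 2.0], [0.3, 0.1]]
biases = [0.0, 0.0, 0.0]
label, probs = softmax_classify(features, weights, biases)

centroids = {"angry": [0.9, 0.1], "happy": [0.1, 0.8], "sad": [0.5, 0.5]}
alt_label = nearest_centroid_classify(features, centroids)
```

Both classifiers consume the same feature vector, which is what allows the classifier to be swapped without retraining the feature extractor.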
The work also explores different techniques for the face detection subtask of the pipeline by comparing two approaches. The first, the Viola-Jones algorithm, is more frequently used in the state of the art and is said to be more viable for real-time applications. The second is a deep learning approach that uses a state-of-the-art convolutional neural network to perform the detection, and is in many cases speculated to be too computationally intensive to run in real time. A newly developed convolutional neural network, inspired by the state of the art and combined with the SoftMax classifier, did not reach state-of-the-art accuracy. However, the machine learning classifiers show promise and surpass the SoftMax function in performance in several cases when trained on a far smaller number of samples. Furthermore, the results from implementing and testing a pure deep learning approach, which uses deep learning algorithms for both the detection and classification stages of the pipeline, show that deep learning might outperform the classic Viola-Jones algorithm in terms of both detection rate and frames per second.
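The two metrics used in the detector comparison, detection rate and frames per second, can be measured with a small harness like the one below. This is a hedged sketch under stated assumptions: `benchmark_detector`, `dummy_detector`, and the frame dictionaries are hypothetical names invented here, and every frame is assumed to contain exactly one face. A real comparison would pass in a Viola-Jones detector and a CNN detector as the `detect` callable.

```python
import time


def benchmark_detector(detect, frames, clock=time.perf_counter):
    """Return (detection_rate, frames_per_second) for one detector.

    `detect` is any callable that takes a frame and returns a list of
    face bounding boxes. Every frame is assumed to contain a face, so a
    non-empty result counts as a successful detection.
    """
    if not frames:
        raise ValueError("need at least one frame")
    hits = 0
    start = clock()
    for frame in frames:
        if detect(frame):  # non-empty list: at least one face found
            hits += 1
    elapsed = clock() - start
    fps = len(frames) / elapsed if elapsed > 0 else float("inf")
    return hits / len(frames), fps


def dummy_detector(frame):
    """Hypothetical stand-in for a real detector, used only for the demo."""
    return [(0, 0, 10, 10)] if frame["detectable"] else []


# Four fake frames; the stand-in detector misses the last one.
frames = [{"detectable": True}] * 3 + [{"detectable": False}]
rate, fps = benchmark_detector(dummy_detector, frames)
```

Running the same harness over the same frames for each detector keeps the comparison fair, since both metrics then depend only on the detector being tested.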
