Face It is a mobile application that detects a person's facial structure, combines it with information about the person's lifestyle and current trends, and uses that data to recommend a hair/beard style.
For this Early Innovation Project, our goal is to have users scan their face to determine a face shape, and then combine that face shape with other personal information, such as hair characteristics and lifestyle, to produce personalized hair and beard style recommendations.
Over the first two weeks, we identified the various stages of development for the project: a user interface design, a preference-selection algorithm, a facial-detection algorithm, and a trained convolutional neural network.
There is a user interface building stage, where our team will use Android Studio, as we plan to make this an Android application. We have started building the UI and are putting the final touches on it. Our UI will consist of three main screens: the face detection screen with the front-facing camera; the preferences screen, where users input information about themselves; and the final output screen, which will show a list of recommended hair/beard styles. We will then integrate our UI with our code, which will consist of the facial shape detector and an algorithm that determines a matching set of hair/beard styles based on the user's detected face shape and selected preferences.
We will ask the user various questions, including the texture and thickness of their hair, whether they want to highlight their facial features, whether they want facial hair, and what kind of lifestyle they lead. We will then provide hair style recommendations based on each user's specific answers. During our research, we found that certain characteristics of an individual, such as face shape, hair texture, hair thickness, and facial features, are complementary to certain hair styles. At the moment, we are manually determining recommendations for every combination of characteristics. As the project progresses, we plan to create a neural network that can read the latest fashion articles, understand them, and provide users personalized recommendations.
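While the real recommendation table is still being curated by hand, the idea of mapping combinations of characteristics to styles can be sketched as a simple lookup. The rule entries and style names below are purely illustrative placeholders, not our actual curated data:

```python
# Hypothetical rules mapping (face_shape, hair_texture) to styles.
# These example entries are made up for illustration only.
STYLE_RULES = {
    ("oval", "straight"): ["pompadour", "side part"],
    ("round", "curly"): ["high fade", "textured crop"],
    ("square", "wavy"): ["crew cut", "quiff"],
}

def recommend(face_shape, hair_texture, wants_beard=False):
    """Look up styles for a characteristic combination, with a fallback."""
    styles = list(STYLE_RULES.get((face_shape, hair_texture), ["classic taper"]))
    if wants_beard:
        styles.append("short boxed beard")  # placeholder beard pairing
    return styles

print(recommend("oval", "straight"))  # → ['pompadour', 'side part']
```

In practice each answer (thickness, lifestyle, and so on) would become another key in the lookup, which is why the number of hand-written combinations grows quickly and motivates the learned approach later on.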
There is also the facial detection stage, where we will use the OpenCV library to recognize a person's face in real time. This will be done using the Haar-Cascade frontal face classifier along with the device's front-facing camera. A list of Haar-Cascade classifiers can be found here: http://alereimondo.no-ip.org/OpenCV/34/. The algorithm for this is complete and will be integrated into the UI soon. As of right now, the app can detect a person's face from a computer webcam. We need to migrate this to a mobile version and send the image of the face to our CNN in real time.
After we detect a person's face, we will use machine learning to determine the person's face shape and factor it into the final recommendations. We have been doing extensive research on machine learning methods. For detecting face shape, we will use a convolutional neural network (CNN). A CNN is an algorithm that, after being trained on a large number of images (in this case, face shapes), can look at a brand-new image and classify it based on the patterns learned from the training dataset. It works by passing the image's pixels through a stack of hidden layers: convolutional layers slide small filters over the image and take the dot product between each filter and the underlying patch of pixels to find distinct features, while pooling layers downsample the resulting feature maps. The activation function used between these layers is the Rectified Linear Unit, or ReLU for short. To create this type of neural network, we will use a Linux operating system, specifically Ubuntu, along with Python 3.5.2 and TensorFlow. We will also use a Xeon Phi cluster to efficiently train on a dataset of images we will gather from Google.
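The convolution, ReLU, and pooling operations described above can be sketched in plain NumPy. This is a toy illustration of the mechanics only, not our actual TensorFlow model:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: zero out negative activations."""
    return np.maximum(0, x)

def conv2d(image, kernel):
    """Valid-mode 2-D convolution: slide the filter over the image and
    take the dot product with each patch of pixels."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return out

def max_pool(feature_map, size=2):
    """Downsample by taking the max over non-overlapping size×size windows."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# One conv → ReLU → pool step on a tiny 4×4 "image".
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
fmap = relu(conv2d(image, kernel))
pooled = max_pool(fmap)
print(fmap.shape, pooled.shape)  # → (3, 3) (1, 1)
```

A real CNN stacks many such layers, with learned (rather than fixed) filter weights, before a final fully connected layer produces the face-shape classification.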
While doing research, we came across various examples we would like to emulate. One is the CNN built for the MNIST dataset, which detects handwritten digits. In this example, the neural network was trained on many images and angles of handwritten digits, and it was able to correctly classify new handwritten digits it was given. Another is AlexNet, the neural network that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 with a highly efficient architecture for image classification; we would like to do something similar. The paper describing AlexNet can be found here: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf.
We have also experimented with Google's pre-made Inception v3 CNN, using a process called transfer learning, in which you add and re-train only the last layer of a pre-trained CNN. We re-trained the last layer to recognize face shapes and achieved an average accuracy of about 60%. This was a good way to confirm that the approach is feasible, and it is! Using what we learned from the MNIST CNN and the Inception v3 model, we plan to create our own model and training dataset.
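The essence of transfer learning, keeping the pre-trained network frozen as a feature extractor and training only a new final layer, can be shown with a tiny NumPy stand-in. Here a fixed random projection plays the role of the frozen Inception v3 bottleneck (which in reality outputs 2048-dimensional features); everything else is made-up toy data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen pre-trained network: a fixed projection from
# flattened 4×4 "images" to 8-dim features. Never updated during training.
W_frozen = rng.normal(size=(16, 8))

def bottleneck(images):
    return images @ W_frozen  # frozen features

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training data: two separable clusters of flattened "images".
X = np.vstack([rng.normal(0.0, 0.1, (20, 16)),
               rng.normal(1.0, 0.1, (20, 16))])
y = np.array([0] * 20 + [1] * 20)

# The only trainable part: a new logistic-regression "last layer".
w = np.zeros(8)
b = 0.0
feats = bottleneck(X)
for _ in range(500):  # gradient descent on the last layer only
    p = sigmoid(feats @ w + b)
    grad = p - y
    w -= 0.1 * feats.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((sigmoid(feats @ w + b) > 0.5) == y).mean()
```

Because only the small last layer is updated, this kind of retraining needs far less data and compute than training a full CNN from scratch, which is why our 60% face-shape result was reachable so quickly.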
We've made good progress so far. Next, we plan to create and train our own convolutional neural network on the Xeon Phi cluster and combine it with our facial detector. We then plan to serve our neural network using TensorFlow Serving. We also plan to have our UI fully designed and built.
We are very excited to work on this project for the next few weeks to see our idea come to life!