Over the summer, I was given an opportunity through the Intel® Student Ambassador Program for Artificial Intelligence (AI) to work on an Early Innovation Project called Face It, a mobile application that uses machine learning to help a user choose a hairstyle. Using the knowledge I gained from that project, my partner, Roshni Shah, and I created a similar application with a different use case during an internship at the Rutgers Wireless Information Network Laboratory (WINLAB). Our project is called the ‘AI Calorie Counter’, a diet application that helps users keep track of the foods they eat and the total number of calories they consume each day.
To start using the application, the user must first create an account, which is saved to a database. When creating the account, the user enters information including their age, activity level, current weight and goal weight. This information is used to calculate the number of daily calories the user should consume in order to lose or gain weight, depending on their goal.
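The article does not say which formula the app uses for this calculation, so the following is only a minimal sketch of one common approach. It assumes the Mifflin-St Jeor equation for basal metabolic rate, standard activity multipliers, and a fixed 500 kcal daily adjustment. The function name, the `height_cm` and `sex` inputs, and the activity level labels are all hypothetical and do not come from our code.

```python
# Sketch only: the formula, multipliers and adjustment are assumptions,
# not the calculation the AI Calorie Counter actually performs.

# Commonly used multipliers that scale resting energy use by activity level.
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
    "very_active": 1.9,
}


def daily_calorie_target(age, weight_kg, goal_weight_kg, height_cm, sex,
                         activity_level, adjustment=500):
    """Return a daily calorie target (kcal) for losing or gaining weight."""
    # Basal metabolic rate from the Mifflin-St Jeor equation.
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age
    bmr += 5 if sex == "male" else -161

    # Calories needed to maintain the current weight.
    maintenance = bmr * ACTIVITY_FACTORS[activity_level]

    # Eat below maintenance to lose weight, above it to gain weight.
    if goal_weight_kg < weight_kg:
        return round(maintenance - adjustment)
    if goal_weight_kg > weight_kg:
        return round(maintenance + adjustment)
    return round(maintenance)


target = daily_calorie_target(age=30, weight_kg=80, goal_weight_kg=75,
                              height_cm=180, sex="male",
                              activity_level="moderate")
print(target)
```

In this sketch, the direction of the adjustment is decided only by comparing the goal weight with the current weight, which mirrors the lose-or-gain behavior described above.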
After creating an account, the user can access the rest of the application’s features, which include the food scanner and the food log. To use the food scanner, the user taps the ‘camera’ button and holds the food item they are about to eat in front of the mobile device’s camera. The food scanner then recognizes the food item and displays the number of calories it contains. If the user still decides to eat the item after seeing its calorie count, they can add the item and its calories to the food log. The food log screen shows the user’s total daily calories. Whenever a new food item is added, its name and calorie count are recorded in the database, and its calories are subtracted from the user’s total daily calories. The user’s remaining calories are displayed on the food log screen as well.
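The food log bookkeeping described above can be sketched in a few lines. This is an illustration rather than our actual implementation: the `FoodLog` class and its method names are hypothetical, an in-memory list stands in for the database, and the calorie values are placeholder numbers.

```python
from dataclasses import dataclass, field


@dataclass
class FoodLog:
    """Hypothetical in-memory stand-in for the database-backed food log."""
    daily_target: int                            # total daily calories (kcal)
    entries: list = field(default_factory=list)  # (food name, calories) pairs

    def add(self, name, calories):
        # Record the food item together with its calorie count.
        self.entries.append((name, calories))

    @property
    def consumed(self):
        # Sum of the calories of every logged item.
        return sum(calories for _, calories in self.entries)

    @property
    def remaining(self):
        # Calories left after subtracting everything logged so far.
        return self.daily_target - self.consumed


log = FoodLog(daily_target=2000)
log.add("apple", 95)     # placeholder calorie values
log.add("banana", 105)
print(log.consumed, log.remaining)  # 200 1800
```

Computing `remaining` from the entries, rather than storing a running total, keeps the displayed number consistent with the log even if an entry is later edited or removed.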
Using our application, the user can not only see how many calories are in a given food item but also keep track of their daily caloric intake, which can support a healthier lifestyle.
A key component of the application is its food recognition feature, which is implemented with machine learning, specifically a convolutional neural network (CNN). We chose a CNN because its architecture is well suited to image recognition tasks. CNN architectures are inspired by biological processes and are variations of multilayer perceptrons designed to require minimal preprocessing. A CNN is made up of multiple layers, each with a distinct function that helps the network recognize an image. These include the convolutional layer, pooling layer, ReLU layer, fully connected layer and loss layer.
- The convolutional layer acts as the core of any CNN. Each of its learnable filters is slid across the input to produce a 2-dimensional activation map, which records how strongly a specific feature is detected at each spatial position. The filter values are the parameters that the network learns.
- The pooling layer acts as a form of downsampling. Max pooling, which keeps only the largest value in each small window of the activation map, is the most common implementation. We chose max pooling because we found it a good fit for our relatively small data set.
- The ReLU (Rectified Linear Units) layer is a layer of neurons that applies the activation function f(x) = max(0, x). It increases the nonlinear properties of the decision function and of the overall network without affecting the receptive fields of the convolutional layer.
- The fully connected layer, which comes after several convolutional and max pooling layers, does the high-level reasoning in the neural network. Neurons in this layer are connected to all of the activations in the previous layer, so their activations can be computed with a matrix multiplication followed by a bias offset.
- The loss layer specifies how training penalizes the deviation between the predicted and true labels. We chose softmax loss for our project because it is suited to predicting a single class out of a set of mutually exclusive classes.
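To make the role of each layer concrete, here is a minimal pure-Python sketch of a single forward pass through the five layer types listed above. It is not our trained model: the 5x5 input, the edge-detecting filter, and the fully connected weights are hypothetical values chosen only for illustration, and a real application would normally rely on a deep learning framework instead.

```python
import math


def conv2d(image, kernel):
    """Slide one filter over one channel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(fmap[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def fully_connected(inputs, weights, biases):
    """Matrix multiplication followed by a bias offset."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax_loss(scores, true_class):
    """Return class probabilities and the cross-entropy loss."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs, -math.log(probs[true_class])


# Hypothetical 5x5 grayscale input with a vertical edge.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # simple vertical-edge filter

feature_map = relu(conv2d(image, kernel))      # 4x4 activation map
pooled = max_pool(feature_map)                 # downsampled to 2x2
flat = [v for row in pooled for v in row]      # flatten for the dense layer

# Hypothetical weights for three mutually exclusive classes.
weights = [[0.5, 0.0, 0.5, 0.0],
           [0.0, 0.5, 0.0, 0.5],
           [0.1, 0.1, 0.1, 0.1]]
biases = [0.0, 0.0, 0.0]

scores = fully_connected(flat, weights, biases)
probs, loss = softmax_loss(scores, true_class=0)
print(pooled, probs, loss)
```

During training, the loss value at the end is what the network tries to reduce by adjusting the filter and fully connected weights. This sketch only shows the forward direction and leaves that update step out.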
For the dataset, approximately 50 images of each food item were collected to train the CNN. The images show each food item at various sizes and orientations so that the network can recognize a new image of the item no matter what angle it is presented at.
There are many improvements still to make to this application, and we have several plans for the future. One feature we would like to improve is the calorie detector. Currently, the number of calories displayed is only the average for the given food item. This is not very accurate, because a large apple contains more calories than a small one. One way to address this is to estimate the volume of the food item being presented. A method called ‘space carving’ estimates the volume of 3D objects, and we would like to look into it and possibly implement it in the future. We would also like to increase the number of food items the food scanner can recognize so that people can use the application for any meal, common or exotic. Currently, the CNN model is trained on only 15 common food items, so expanding this list is definitely something we would like to do. The last major improvement we are considering is a fitness component that recommends daily exercises to the user and tracks how many calories are burned by each exercise.
You can view more details about this project here and watch a video of the working application here. The project is open source, so if you are interested in playing around with the code or helping us implement one of the improvements described above, feel free to download the source code on GitHub here.