Drone Navigation in Semi-Cluttered Environment – Update 1

Live Video Stream Object Classification with Intel® Movidius™ NCS

The goal of this project is to use an Intel® Movidius™ Neural Compute Stick (NCS) for object classification in live video streams. The main appeal of the NCS is that it eliminates the need to retrain a neural network for machine learning applications: pre-trained models can be loaded onto the stick and used during inference – the prediction step – without a powerful computer. The NCS is fairly small (72.5 mm x 27 mm x 14 mm), which makes it a good fit for low-power applications like drone navigation systems. In this report, I’ll explain the major steps that are necessary for identifying objects in a live video stream using an NCS.

1. Installing the Intel® Movidius™ software developer kit (SDK)

I used the Ubuntu 16.04 operating system for installing the Intel® Movidius™ software developer kit (SDK). I just followed the instructions, which are pretty straightforward. However, one small point that might be helpful: if you have multiple versions of Python installed already, make sure they are assigned different names. In other words, instead of calling both of them python, use names like python2 and python3 respectively. Otherwise, during installation, the SDK might have some difficulty identifying the right Python environment.
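A quick way to check which interpreter a given name actually resolves to is to ask the interpreter itself; a minimal sketch (run it under each name, e.g. python2 and python3, before installing the SDK):

```python
# Print the path and version of the interpreter this script is running
# under, so you can verify which environment each name points to.
import sys

print(sys.executable)
print("Python %d.%d" % (sys.version_info.major, sys.version_info.minor))
```

If the two names print the same path, the SDK installer may end up in the wrong environment.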

2. Loading pre-included examples

Once the SDK is installed, it’s time to try out some of the examples. You can just go to the examples folder and run them.

3. Object classification in video

For the sake of this report, I captured live video from the laptop camera and then loaded a pre-trained neural network model onto the NCS in order to classify objects in the video frames. The model, GoogLeNet, was already trained and is included in the SDK examples; it is based on this paper. As mentioned earlier, one of the major benefits of the NCS is that the model does not need to be trained again: it is simply loaded onto the NCS, which then predicts the label of the object of interest in each frame.
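The load-and-predict flow described above can be sketched with the NCSDK v1 Python API (mvnc). This is a hardware-dependent sketch, not the exact code from the SDK example: the graph file path and the one-call-per-frame structure are my assumptions, and it requires an attached NCS plus a compiled GoogLeNet graph file to actually run.

```python
# Sketch of NCS inference with the NCSDK v1 Python API (mvnc).
# Assumptions: NCSDK v1 is installed, an NCS is plugged in, and a
# compiled model file named "graph" (the SDK compiler's output) exists.
try:
    from mvnc import mvncapi as mvnc  # part of the Movidius NCSDK
except ImportError:
    mvnc = None  # lets the sketch be read without the SDK installed

GRAPH_PATH = "graph"     # assumed path to the compiled GoogLeNet model
INPUT_SIZE = (224, 224)  # GoogLeNet's expected input resolution

def classify_frame(frame):
    """Run one preprocessed frame through the NCS; return the output vector.

    `frame` must already be a float16 numpy array resized to INPUT_SIZE,
    since the NCS expects half-precision input tensors.
    """
    devices = mvnc.EnumerateDevices()
    if not devices:
        raise RuntimeError("No NCS device found")
    device = mvnc.Device(devices[0])
    device.OpenDevice()
    with open(GRAPH_PATH, "rb") as f:
        graph = device.AllocateGraph(f.read())  # load model onto the stick
    graph.LoadTensor(frame, "user object")      # send the frame
    output, _ = graph.GetResult()               # class probabilities
    graph.DeallocateGraph()
    device.CloseDevice()
    return output

if mvnc is None:
    print("NCSDK not installed; this is a hardware-dependent sketch.")
```

In a real loop you would open the device and allocate the graph once, then call LoadTensor/GetResult per frame; it is done per call here only to keep the sketch self-contained.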

Below we take a look at some of the objects that were shown to the camera and the corresponding model predictions. I just want to emphasize again that only the pre-trained model is loaded onto the NCS, and the NCS performs the prediction task (inference). For each frame, the top 5 predictions with the highest probabilities are shown on the screen. The screenshots below show some examples of the model’s predictions along with their probabilities; the live video shows the full demo.
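Selecting the top 5 labels from the network’s output vector is a small post-processing step; a minimal sketch (the label names and probabilities here are made up for illustration, not the real ImageNet categories):

```python
# Given the model's output probabilities and the category list,
# report the k most likely labels, as shown in the screenshots.
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Toy example with made-up categories and probabilities:
labels = ["mouse", "sandal", "whistle", "knot", "remote", "sunglasses"]
probs = [0.40, 0.05, 0.10, 0.02, 0.30, 0.13]
for label, p in top_k(probs, labels, k=5):
    print("%s: %.0f%%" % (label, p * 100))  # "mouse: 40%" comes first
```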

Note that I used a sheet of white paper as the background to reduce any noise that might come from the environment.

Actual Label: eyeglasses

NCS model top predictions: stethoscope, knot, sandal, whistle, sunglasses

Actual Label: remote control

NCS model top predictions: electric switch, mousetrap, band aid, remote, harmonica

Actual Label: computer mouse

NCS model top predictions: computer mouse, car mirror, lens cap, switch, spotlight

Actual Label: iPhone

NCS model top predictions: iPod, cell, ignitor, remote, pencil box

Actual Label: screwdriver

NCS model top predictions: pencil eraser, screwdriver, mouse, paint brush

Actual Label: notebook computer

NCS model top predictions: notebook computer, keypad, laptop, space bar, mouse

Actual Label: pen

NCS model top predictions: ball pen, paper knife, quill, screwdriver

Actual Label: binder clip

NCS model top predictions: whistle, mousetrap, toaster, modem, traffic light

Analyzing the model’s predictions, especially in the cases where it is wrong, shows that the model is not far off from reality. For example, in the case of the binder clip the model predicted a mousetrap, which is not unreasonable if we think about it, given that the model might not have seen a binder clip in its training data.

4. Next steps

I would like to improve the model’s performance so that it gives better accuracy next time. One possible improvement is pre-processing the input images so that the model performs robustly under different lighting conditions. Another future task is connecting the NCS to a Raspberry Pi camera, because eventually I want to use this NCS on a drone, which might carry a processing unit like a Raspberry Pi. Finally, using a more powerful model that can deal with environmental noise is another aspect to consider for the next step.
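One simple form of the pre-processing mentioned above is histogram equalization, which spreads pixel intensities over the full range so that frames taken under different lighting look more alike. A minimal grayscale sketch (a real pipeline would work on color camera frames, typically via OpenCV; this pure-Python version is only illustrative):

```python
# Histogram equalization for an 8-bit grayscale image stored as a flat
# list of pixel values; stretches the intensities over the 0-255 range.
def equalize(pixels):
    # Count how many pixels fall into each of the 256 intensity bins.
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    # Build the cumulative distribution function of the histogram.
    cdf = []
    total = 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)  # first non-zero CDF value
    n = len(pixels)
    scale = 255.0 / (n - cdf_min) if n > cdf_min else 1.0
    # Map each pixel through the normalized CDF.
    return [round((cdf[v] - cdf_min) * scale) for v in pixels]

# A dim image (values clustered low) gets stretched to the full range:
dim = [50, 51, 52, 52, 53, 54]
print(equalize(dim))  # → [0, 51, 153, 153, 204, 255]
```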

Continue to next post: Drone Navigation in Semi-Cluttered Environment – Update 2
