Image classification is a computer vision problem that aims to classify a subject or an object present in an image into one of a set of predefined classes. A typical real-world example of image classification is showing an image flash card to a toddler and asking the child to recognize the object printed on the card. Traditional approaches to providing such visual perception to machines have relied on complex computer algorithms that use feature descriptors, such as edges, corners, and colors, to identify or recognize objects in the image.
Deep learning takes a rather interesting, and by far the most efficient, approach to solving real-world imaging problems. It uses multiple layers of interconnected neurons, where each layer uses a specific algorithm to identify and classify a specific descriptor. For example, if you wanted to classify a traffic stop sign, you would use a deep neural network (DNN) that has one layer to detect edges and borders of the sign, another layer to detect the number of corners, the next layer to detect the color red, the next to detect a white border around red, and so on. The ability of a DNN to break down a task into many layers of simple algorithms allows it to work with a larger set of descriptors, which makes DNN-based image processing much more effective in real-world applications.
NOTE: the above image is a simplified representation of how a DNN would identify different descriptors of an object. It is by no means an accurate representation of a DNN used to classify STOP signs.
Image classification is different from object detection. Classification assumes there is only one object in the entire image, sort of like the ‘image flash card for toddlers’ example I referred to above. Object detection, on the other hand, can process multiple objects within the same image. It can also tell you the location of the object within the image.
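The difference is easiest to see in the shape of the results each task produces. The snippet below is purely illustrative (the labels, scores, and bounding boxes are made up, not the output of any particular network):

```python
# Classification: one label (or a ranked list of labels) for the whole image.
classification_result = { 'label': 'tabby cat', 'confidence': 0.92 }

# Detection: a list of objects, each with its own label and a location
# in the image, typically a bounding box (x_min, y_min, x_max, y_max).
detection_results = [
    { 'label': 'tabby cat', 'confidence': 0.92, 'bbox': (  34,  50, 120, 140 ) },
    { 'label': 'stop sign', 'confidence': 0.81, 'bbox': ( 200,  80, 380, 220 ) },
]
```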
You will build...
A program that reads images from a folder and classifies each of them into its top 5 probable categories.
You will learn...
You will need...
If you haven’t already done so, install NCSDK on your development machine. Refer to the NCS Quick Start Guide for installation instructions.
If you would like to see the final output before diving into programming, download the code from our sample code repository (NC App Zoo) and run it.
```shell
mkdir -p ~/workspace
cd ~/workspace
git clone https://github.com/movidius/ncappzoo
cd ncappzoo/apps/image-classifier
make run
```
make run downloads and builds all the dependent files, like the pre-trained networks, the binary graph file, the ILSVRC dataset mean, etc. We have to run make run only the first time; after that, the app can be rerun without re-downloading these files.
You should see an output similar to:
```shell
------- predictions --------
prediction 1 is n02123159 tiger cat
prediction 2 is n02124075 Egyptian cat
prediction 3 is n02113023 Pembroke, Pembroke Welsh corgi
prediction 4 is n02127052 lynx, catamount
prediction 5 is n02971356 carton
```
Thanks to NCSDK’s comprehensive API framework, it only takes a few lines of Python to build an image classifier. Below are some of the user-configurable parameters of image-classifier.py:
GRAPH_PATH: Location of the graph file against which we want to run the inference
IMAGE_PATH: Location of the image we want to classify
IMAGE_DIM: Dimensions of the image as defined by the chosen neural network
IMAGE_STDDEV: Standard deviation (scaling value) as defined by the chosen neural network
IMAGE_MEAN: Mean subtraction is a common technique used in deep learning to center the data
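As a concrete illustration, the parameters above might be defined as follows. The file names and values here are assumptions, chosen to match a typical AlexNet graph trained on ILSVRC; your network may require different dimensions, mean, and scaling:

```python
import numpy

GRAPH_PATH       = 'graph'              # compiled binary graph (assumed filename)
IMAGE_PATH       = 'image.jpg'          # image to classify (assumed filename)
LABELS_FILE_PATH = 'synset_words.txt'   # ILSVRC category labels (assumed filename)

IMAGE_DIM    = ( 227, 227 )             # AlexNet expects 227x227 input
IMAGE_STDDEV = 1.0                      # AlexNet graphs typically apply no scaling
# Per-channel (BGR) mean of the ILSVRC dataset, used to center the data
IMAGE_MEAN   = numpy.float16( [ 104.0, 117.0, 123.0 ] )
```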
Before using the NCSDK API framework, we have to import the mvncapi module from the mvnc library:
import mvnc.mvncapi as mvnc
Just like any other USB device, when you plug the NCS into your application processor’s (Ubuntu laptop/desktop) USB port, it enumerates itself as a USB device. We will call an API to look for the enumerated NCS device.
```python
# Look for enumerated Intel Movidius NCS device(s); quit program if none found.
devices = mvnc.EnumerateDevices()
if len( devices ) == 0:
    print( 'No devices found' )
    quit()
```
Did you know that you can connect multiple Neural Compute Sticks to the same application processor to scale inference performance? More about this in a later article, but for now let’s call the APIs to pick just one NCS and open it (get it ready for operation).
```python
# Get a handle to the first enumerated device and open it
device = mvnc.Device( devices[0] )
device.OpenDevice()
```
To keep this project simple, we will use a pre-compiled graph of a pre-trained AlexNet model, which was downloaded and compiled when you ran make inside the ncappzoo folder. We will learn how to compile a pre-trained network in another blog, but for now let's figure out how to load the graph into the NCS.
```python
# Read the graph file into a buffer
with open( GRAPH_PATH, mode='rb' ) as f:
    blob = f.read()

# Load the graph buffer into the NCS
graph = device.AllocateGraph( blob )
```
The Intel Movidius NCS is powered by the Intel Movidius visual processing unit (VPU). It is the same chip that provides visual intelligence to millions of smart security cameras, gesture-controlled drones, industrial machine vision equipment, and more. Just like the VPU, the NCS acts as a visual co-processor in the entire system. In our case, we will use the Ubuntu system to simply read images from a folder and offload them to the NCS for inference. All of the neural network processing is done solely by the NCS, thereby freeing up the application processor's CPU and memory resources to perform other application-level tasks.
In order to load an image onto the NCS, we have to pre-process it first, and then make the LoadTensor function call to load the image onto the NCS.
```python
# Read & resize image [Image size is defined during training]
img = print_img = skimage.io.imread( IMAGE_PATH )
img = skimage.transform.resize( img, IMAGE_DIM, preserve_range=True )

# Convert RGB to BGR [skimage reads image in RGB, but Caffe uses BGR]
img = img[:, :, ::-1]

# Mean subtraction & scaling [A common technique used to center the data]
img = img.astype( numpy.float32 )
img = ( img - IMAGE_MEAN ) * IMAGE_STDDEV

# Load the image as a half-precision floating point array
graph.LoadTensor( img.astype( numpy.float16 ), 'user object' )
```
Depending on how you want to integrate the inference results into your application flow, you can choose to use either a blocking or non-blocking function call to load tensor (previous step) and read inference results. We will learn more about this functionality in a later blog, but for now let’s just use the default, which is a blocking call (no need to call a specific API).
```python
# Get the results from NCS
output, userobj = graph.GetResult()

# Print the results
print( '\n------- predictions --------' )
labels = numpy.loadtxt( LABELS_FILE_PATH, str, delimiter='\t' )
order = output.argsort()[::-1][:5]
for i in range( 0, 5 ):
    print( 'prediction ' + str( i + 1 ) + ' is ' + labels[order[i]] )

# Display the image on which inference was performed
skimage.io.imshow( IMAGE_PATH )
skimage.io.show()
```
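The top-5 selection above relies on numpy's argsort: sort the scores ascending, reverse the index order, and keep the first five indices. A toy example with made-up scores shows the mechanics:

```python
import numpy

# Hypothetical output scores from a six-class network
output = numpy.array( [ 0.05, 0.60, 0.10, 0.20, 0.03, 0.02 ] )

# argsort() gives indices in ascending score order; [::-1] reverses
# to descending; [:5] keeps the five highest-scoring class indices.
order = output.argsort()[::-1][:5]
print( order )   # [1 3 2 0 4]
```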
In order to avoid memory leaks and/or segmentation faults, we should close any open files or resources and deallocate any used memory.
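A minimal sketch of that cleanup step, assuming `graph` and `device` are the objects created earlier with AllocateGraph() and Device()/OpenDevice(); the helper name `clean_up` is hypothetical:

```python
def clean_up( device, graph ):
    """Free NCS resources; call once after inference is finished."""
    graph.DeallocateGraph()   # free the graph memory allocated on the NCS
    device.CloseDevice()      # release the USB device handle
```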
Congratulations! You just built a DNN-based image classifier.