OpenVINO™ Toolkit Reference Implementations

This episode of the IoT Developer Show takes a look at reference implementations and code samples for the OpenVINO™ toolkit. Get an introduction to several Intel code samples and reference implementations for developers that demonstrate a wide range of OpenVINO™ toolkit capabilities.

OpenVINO™ Toolkit

Reference Implementations

Code Samples

Hello. My name is Martin Kronberg. Welcome to the IoT Developer Show. In this series we take a deep dive into Intel's OpenVINO Toolkit, and today we look at some Intel reference implementations and sample code to help developers get started with the toolkit.

There are two different collections of code that I want to go over today. Both are found on the Intel Developer Zone, and they're both posted in the links. The first is a collection of code samples on the OpenVINO Toolkit page showing specific features and processing pipelines. These samples are broken down into four sections. The first is the deep learning inference engine, which shows how to use convolutional neural networks for image processing on Intel-based platforms. The second is OpenCV, which shows how to use OpenCV for various image transformations and object detection. The third is introductory and advanced OpenVX, which uses OpenVX for image transformation, heterogeneous pipelines, and some advanced image processing. And finally, there is domain-specific OpenVX, which covers media processing with GStreamer, automotive lane detection, and image enhancement for printing.

I want to highlight a couple of the deep-learning code samples and show them running on an IEI Tank AIoT developer kit. The first sample does object detection using the VGG16 Faster R-CNN model. This is an implementation of the Faster R-CNN model, which uses a convolutional neural network to detect objects and draw bounding boxes around them.

If you're like me and are wondering why this is called Faster R-CNN, well, it turns out that it's a more optimized version of the Fast R-CNN model, which sort of paints them into a naming corner if they want to optimize it further. I don't know, maybe they can call it 2 Fast 2 R-CNN. Anyway, if you need to do object detection in real time, this is a great code sample to start with.

Next I want to take a look at the multichannel face-detection sample. This sample sets up a pipeline to detect faces from multiple video streams. It uses the face-detection-retail-004 model, which is a custom-built single-shot detector model from Intel. We'll use multiple USB cameras to create our video streams and then process them using this pipeline. I can set a heterogeneous run mode on the command line using the -d flag. Here, I'm specifying that I want to run on CPU first and then fall back to GPU.
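To make the "CPU first, then fall back to GPU" idea concrete, here is a minimal Python sketch of priority-based placement. It is not the sample's code or the toolkit's API. It assumes the -d value takes a form like HETERO:CPU,GPU, and the layer names and support table are hypothetical.

```python
from typing import Dict, List, Set


def parse_device_flag(value: str) -> List[str]:
    """Split a device string such as 'HETERO:CPU,GPU' into a priority list.

    Assumption: a 'HETERO:' prefix followed by comma-separated device names.
    A plain value such as 'CPU' yields a single-entry list.
    """
    prefix, sep, rest = value.partition(":")
    if sep and prefix.strip().upper() == "HETERO":
        return [d.strip().upper() for d in rest.split(",") if d.strip()]
    return [value.strip().upper()]


def assign_layers(layers: List[str], priority: List[str],
                  supported: Dict[str, Set[str]]) -> Dict[str, str]:
    """Place each layer on the first device in priority order that supports it."""
    placement: Dict[str, str] = {}
    for layer in layers:
        for device in priority:
            if layer in supported.get(device, set()):
                placement[layer] = device
                break
        else:
            # No device in the priority list can run this layer.
            raise ValueError(f"no device supports layer {layer!r}")
    return placement
```

With a priority of CPU then GPU, any layer the CPU supports stays on the CPU, and only the remaining layers move to the GPU.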

There are many more useful samples on the site, including stabilizing videos, lane detection, and face detection. This is a great place to see all the features that OpenVINO has to offer. 

In addition to these code samples, Intel has also created a number of reference implementations which cover more fully realized solutions. There is a facial-recognition access control, a people-counter system, an intruder detector, and a few others. These implementations include a codebase along with detailed descriptions and front-end user interfaces for the applications.

Let's take a look at one of the latest implementations, the shopper-gaze monitor. This application is designed for a retail-shelf-mounted camera system that counts the number of passersby who look toward the display versus the number who pass by without looking. Basically, it's a way of gathering information about how effective your advertisement is.

The application uses a video source, such as a camera, to grab frames and then uses two different deep neural networks to process the data. The first network looks for faces and counts them as shoppers. A second neural network is then used to determine the head pose for each detected face. If the person's head is facing towards the camera, it's counted as a looker. These two neural network models are custom-built single-shot detectors for face detection and head-pose detection, respectively.
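The two-stage counting logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the reference implementation's code. The Face record, the 0.5 confidence cutoff, and the 22.5-degree angle limit are all hypothetical, and the face and head-pose values are assumed to come from the two networks.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Face:
    """Hypothetical per-face result: detector confidence plus head-pose angles in degrees."""
    confidence: float
    yaw: float
    pitch: float


def count_frame(faces: Iterable[Face],
                min_confidence: float = 0.5,   # assumed detection threshold
                max_angle: float = 22.5        # assumed "facing the camera" limit
                ) -> Tuple[int, int]:
    """Return (shoppers, lookers) for one frame."""
    shoppers = 0
    lookers = 0
    for face in faces:
        if face.confidence < min_confidence:
            continue  # too uncertain to count as a face
        shoppers += 1  # every accepted face is a shopper
        # A shopper is a looker when the head points roughly at the camera.
        if abs(face.yaw) <= max_angle and abs(face.pitch) <= max_angle:
            lookers += 1
    return shoppers, lookers
```

Every looker is also a shopper, so the ratio of lookers to shoppers gives a rough measure of how much attention the display draws.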

The program creates three threads: the main thread that performs the video input/output, a worker thread that processes video frames using the deep neural networks, and a worker thread that publishes any messages over MQTT, which is a common machine-to-machine protocol.
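The three-thread layout described here could look roughly like the following Python sketch. It is an assumption-laden illustration, not the application's code. Frames are plain dictionaries, inference is replaced by a simple count, the topic name retail/traffic is hypothetical, and publish is a caller-supplied function standing in for a real MQTT client.

```python
import json
import queue
import threading
from typing import Callable, Iterable

STOP = object()  # sentinel that tells a worker to shut down


def inference_worker(frames: queue.Queue, messages: queue.Queue) -> None:
    """Worker thread 1: turn frames into count messages (inference is stubbed)."""
    while True:
        frame = frames.get()
        if frame is STOP:
            messages.put(STOP)  # pass the shutdown signal downstream
            break
        faces = frame["faces"]
        messages.put({
            "shoppers": len(faces),
            "lookers": sum(1 for f in faces if f["looking"]),
        })


def publisher_worker(messages: queue.Queue,
                     publish: Callable[[str, str], None]) -> None:
    """Worker thread 2: publish each message as JSON on a hypothetical topic."""
    while True:
        message = messages.get()
        if message is STOP:
            break
        publish("retail/traffic", json.dumps(message))


def run(frame_source: Iterable[dict], publish: Callable[[str, str], None]) -> None:
    """Main thread: read frames and hand them to the workers."""
    frames: queue.Queue = queue.Queue(maxsize=4)  # bounded so input cannot outrun inference
    messages: queue.Queue = queue.Queue()
    workers = [
        threading.Thread(target=inference_worker, args=(frames, messages)),
        threading.Thread(target=publisher_worker, args=(messages, publish)),
    ]
    for worker in workers:
        worker.start()
    for frame in frame_source:
        frames.put(frame)
    frames.put(STOP)
    for worker in workers:
        worker.join()
```

Queues decouple the stages, so a slow network publish does not stall frame capture, and the sentinel lets all three threads shut down in order.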

Finally, I want to mention that Intel has created a number of videos specifically for testing this and all of our other code samples and implementations. The videos include things like bolts on a conveyor belt to test flaw detection and people entering an area to look at a camera to test the gaze attention that I discussed today. All the videos are open source, so you can feel free to use them in all your applications.

And that does it for today. Thank you so much for watching. As always, all the links are provided, and tune in next time for our final episode of season two of the IoT Developer Show.