How It Works
Step through the process of using the Intel® Distribution of OpenVINO™ toolkit, and take a closer look at the key phases from setting up and planning your solution to deployment.
The Workflow
This toolkit contains a full suite of development and deployment tools. To try building a project from start to finish, use the Intel® DevCloud for the Edge, which includes a fully configured set of pretrained models and hardware for evaluation.
Prerequisite: Plan and Set Up
Select your host and target platforms, and make choices about models.
Determine Environments and Configuration
The toolkit supports Linux*, Windows*, macOS*, and Raspbian* operating systems. While the intermediate representation and application code are agnostic of the target device and operating system, you may need to create deployment packages within the specific target environment.
Supported Development Platforms
Determine Model Type and Framework
The toolkit supports deep learning model training frameworks such as TensorFlow*, Caffe*, MXNet*, and Kaldi*, as well as the ONNX* model format. The support also includes most layers within those frameworks. In addition, the toolkit can be extended to support custom layers.
Step 1: Train Your Model
Use your framework of choice to prepare and train a deep learning model.
Use Pretrained Models
Find an open-source pretrained model or build your own model. The Open Model Zoo provides an open-source repository of optimized, pretrained models for common tasks such as object recognition, pose estimation, text detection, and action recognition. The repository also includes validated support for public models and an open-source collection of code samples and demos, and it is licensed under Apache 2.0.
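For example, a pretrained model can be fetched with the Open Model Zoo downloader. The following is a minimal sketch, assuming the omz_downloader entry point installed by the openvino-dev pip package (older releases ship the equivalent downloader.py script); the model name and output directory are placeholders.

```python
import subprocess

# Download a pretrained Open Model Zoo model.
# omz_downloader comes with the openvino-dev pip package; older releases use downloader.py.
subprocess.run(
    [
        "omz_downloader",
        "--name", "face-detection-adas-0001",  # example model from the Open Model Zoo
        "--output_dir", "models",              # where to place the downloaded files
    ],
    check=True,
)
```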
Prepare Your Model
Use scripts or a manual process to configure the Model Optimizer for the framework used to train the model.
Step 2: Convert and Optimize
Run the Model Optimizer to convert your model and prepare it for inferencing.
Run the Model Optimizer
Run the Model Optimizer to convert your model to an intermediate representation (IR), which consists of a pair of files (.xml and .bin). These files describe the network topology and contain the binary weights and biases of the model.
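As a rough sketch, conversion is typically run from the command line; the snippet below wraps one such invocation in Python. The mo entry point from the openvino-dev package and the ONNX model path are assumptions.

```python
import subprocess

# Convert a trained model to OpenVINO IR (.xml + .bin).
# "mo" is the Model Optimizer entry point from the openvino-dev package;
# the model path and output directory below are placeholders.
subprocess.run(
    [
        "mo",
        "--input_model", "model.onnx",  # trained model exported from your framework
        "--output_dir", "ir",           # where model.xml and model.bin are written
        "--data_type", "FP16",          # optional: convert weights to half precision
    ],
    check=True,
)
```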
Post-Conversion Checks and Validation
Along with the pair of files (.xml and .bin), the Model Optimizer outputs diagnostic messages to aid in further tuning. In addition, the open-source Accuracy Checker tool can help validate the accuracy of the converted model. To accelerate inference by converting the model into a hardware-friendly representation (for example, a lower precision such as INT8) without retraining, use the Post-Training Optimization Tool.
Step 3: Tune for Performance
Use the Inference Engine to compile the optimized network and manage inference operations on specified devices.
Run the Inference Engine
Load and compile the optimized model, run inference operations on the input data, and output the results. The Inference Engine is a high-level (C, C++, or Python) inference API whose interface is implemented as dynamically loaded plugins for each hardware type. It delivers optimal performance on each hardware target without requiring you to implement and maintain multiple code pathways.
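A minimal sketch of this flow with the Python API follows, assuming the 2021-era openvino.inference_engine module; the file names, device, and input data are placeholders.

```python
import numpy as np
from openvino.inference_engine import IECore  # 2021-era Inference Engine Python API

ie = IECore()

# Read the IR produced by the Model Optimizer (paths are placeholders).
net = ie.read_network(model="model.xml", weights="model.bin")

# Compile the network for a specific device; the matching plugin is loaded automatically.
exec_net = ie.load_network(network=net, device_name="CPU")

# Look up the input/output tensor names and the expected input shape.
input_name = next(iter(net.input_info))
output_name = next(iter(net.outputs))
n, c, h, w = net.input_info[input_name].input_data.shape

# Run a synchronous inference on dummy data and read the result.
dummy_input = np.zeros((n, c, h, w), dtype=np.float32)
result = exec_net.infer(inputs={input_name: dummy_input})
print(result[output_name].shape)
```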
Optimize Performance
Additional tools within the toolkit help to improve performance.
- The Benchmark App analyzes the inference performance of your model (see the sketch after this list).
- The Cross-Check Tool compares accuracy and performance between two successive model inferences.
- The Deep Learning Workbench allows you to visualize, fine-tune, and compare the performance of deep learning models.
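For instance, the Benchmark App can be run directly against the IR files. The sketch below assumes the benchmark_app entry point from the openvino-dev package and a placeholder model path.

```python
import subprocess

# Measure throughput and latency of the converted model on a chosen device.
# benchmark_app ships with the openvino-dev package; the model path is a placeholder.
subprocess.run(
    [
        "benchmark_app",
        "-m", "ir/model.xml",  # IR produced by the Model Optimizer
        "-d", "CPU",           # target device (for example, CPU, GPU, MYRIAD)
        "-niter", "100",       # number of inference iterations to run
    ],
    check=True,
)
```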
Step 4: Deploy Applications
Use the Inference Engine to deploy your applications.
Call the Inference Engine
The Inference Engine is called as a core object, optionally with extensions, which reads the optimized nGraph network model and loads it onto the specific target device. With the network loaded, the Inference Engine accepts input data and inference requests and delivers the output data.
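To make this concrete, the sketch below shows the core-object flow with an asynchronous inference request, again assuming the 2021-era Python API; the paths, device, and input data are placeholders, and the add_extension call is needed only if the model uses custom layers.

```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
# ie.add_extension("libcustom_layers.so", "CPU")  # only if a custom-layer extension is needed (placeholder path)

net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU", num_requests=2)

input_name = next(iter(net.input_info))
output_name = next(iter(net.outputs))
shape = net.input_info[input_name].input_data.shape

# Submit a request asynchronously, do other work, then wait for the result.
exec_net.start_async(request_id=0, inputs={input_name: np.zeros(shape, dtype=np.float32)})
# ... the application can prepare the next input here ...
if exec_net.requests[0].wait(-1) == 0:  # 0 means the request completed successfully
    output = exec_net.requests[0].output_blobs[output_name].buffer
    print(output.shape)
```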
Inference Engine Developer Guide
Deploy to Runtime Environments
Use the Deployment Manager to assemble the model IR files, your application, and its associated dependencies into a runtime package for your target device.
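As a rough illustration, the Deployment Manager ships as a script inside the toolkit installation; the invocation below reflects a typical 2021-era layout, and the script location, flag names, and directories are assumptions to verify against the tool's --help output.

```python
import subprocess

# Assemble a runtime package for a target device with the Deployment Manager.
# Script path and flag names are assumptions for a typical 2021-era installation.
subprocess.run(
    [
        "python3", "deployment_manager.py",
        "--targets", "cpu",                       # runtime components to include
        "--user_data", "/path/to/app_and_model",  # your application, IR files, and dependencies
        "--output_dir", "deployment_package",     # where the runtime package is written
    ],
    check=True,
)
```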
Toolkit Add-Ons
Computer Vision Annotation Tool
This web-based tool helps annotate videos and images before training a model.
Dataset Management Framework (Datumaro)
Use this add-on to build, transform, and analyze datasets.
DL Streamer
Consider this analytics framework to create and deploy complex media analytics pipelines with the Intel® Distribution of OpenVINO™ toolkit.
Neural Network Compression Framework
Use this framework based on PyTorch for quantization-aware training.
OpenVINO™ Model Server
This scalable inference server serves models optimized with the Intel® Distribution of OpenVINO™ toolkit.
OpenVINO™ Security Add-on
This add-on protects models and allows you to control their use through secure packaging and secure model execution. Using kernel-based virtual machines (KVM) and Docker* containers, it is compatible with the OpenVINO™ Model Server for a scalable serving microservice and enables flexible, secure packaging for deployment.
OpenVINO™ Training Extensions
Access trainable deep learning models for training with custom data.