Release Notes for Intel® Distribution of OpenVINO™ Toolkit 2021

By Andrey Zaytsev, Alina Alborova

Published: 10/06/2020   Last Updated: 10/06/2020

Note For the Release Notes for the 2020 version, refer to the Release Notes for Intel® Distribution of OpenVINO™ toolkit 2020.

Introduction

The Intel® Distribution of OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and many others. Based on the latest generations of artificial neural networks, including Convolutional Neural Networks (CNNs) and recurrent and attention-based networks, the toolkit extends computer vision and non-vision workloads across Intel® hardware, maximizing performance. It accelerates applications with high-performance AI and deep learning inference deployed from edge to cloud.

The Intel® Distribution of OpenVINO™ toolkit:

  • Enables deep learning inference from the edge to cloud.
  • Supports heterogeneous execution across Intel accelerators, using a common API for the Intel® CPU, Intel® Integrated Graphics, Intel® Gaussian & Neural Accelerator, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs.
  • Speeds time-to-market through an easy-to-use library of CV functions and pre-optimized kernels.
  • Includes optimized calls for CV standards, including OpenCV* and OpenCL™.

New and Changed in Release 2

Executive Summary

  • Integrates the Deep Learning Workbench with the Intel® DevCloud for the Edge as a Beta release. Graphically analyze models using the Deep Learning Workbench on the Intel® DevCloud for the Edge (instead of a local machine only) to compare, visualize and fine-tune a solution against multiple remote hardware configurations.
  • Introduces support for Red Hat Enterprise Linux (RHEL) 8.2. See System Requirements for more info.
  • Introduces per-channel quantization support in the Model Optimizer for models quantized with TensorFlow Quantization-Aware Training that contain per-channel quantization of weights, which improves performance through model compression and latency reduction.
  • Pre-trained models and support for public models to streamline development:
    • Public Models: YOLOv4 (for object detection), AISpeech (for speech recognition), and DeepLabv3 (for semantic segmentation)
    • Pre-trained Models: Human Pose Estimation (update), Formula Recognition Polynomial Handwritten (new), Machine Translation (update), Common Sign Language Recognition (New), and Text-to-Speech (new)
  • New OpenVINO™ Security Add-on, which controls access to model(s) through secure packaging and execution. Based on KVM Virtual machines and Docker* containers and compatible with the OpenVINO™ Model Server, this new add-on enables packaging for flexible deployment and controlled model access.
  • The PyPI project moved from openvino-python to openvino, and the 2021.1 version will be removed from the default view. Users who depend on that exact version can still install it with openvino-python==2021.1; see the commands below.
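
For reference, the corresponding pip commands (package names as described above):

```sh
pip install openvino                   # new project name, 2021.2 onward
pip install openvino-python==2021.1    # pin the old project name to stay on 2021.1
```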

Model Optimizer

Common changes

  • Updated requirements for the numpy component to avoid compatibility issues with TensorFlow 1.x.
  • Improved reshape-ability of models with eltwise and CTCGreedyDecoder operations

ONNX*

  • Enabled the ability to specify the name of the model output tensor using the "--output" command-line parameter (see the example after this list).
  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • DepthToSpace-11, 13
    • DequantizeLinear-10 (zero_point must be constant)
    • HardSigmoid-1,6
    • QuantizeLinear-10 (zero_point must be constant)
    • ReduceL1-11, 13
    • ReduceL2-11, 13
    • Resize-11, 13 (except mode="nearest" with 5D+ input, mode="tf_crop_and_resize", and attributes exclude_outside and extrapolation_value with non-zero values)
    • ScatterND-11, 13
    • SpaceToDepth-11, 13
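
For example, a hypothetical Model Optimizer invocation selecting an output tensor by name (the model and tensor names are placeholders):

```sh
python3 mo.py --input_model model.onnx --output prob
```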

TensorFlow*

  • Added support for TensorFlow Object Detection API models with a pre-processing block in which mean/scale values are applied before the image is resized. Previously, only the case where mean/scale values are applied after the resize was supported.
  • Aligned the FakeQuantize limits adjustment with the TensorFlow approach.
  • Added support for the following operations:
    • GatherND
    • Round
    • NonMaxSuppression
    • LogSoftmax
    • FakeQuantWithMinMaxVarsPerChannel

MXNet*

  • Added support for the following operations:
    • GatherND
    • Round

Kaldi*

  • Added support for the following operations:
    • TdnnComponent

Inference Engine

Common changes

  • Removed the dependency on inference_engine_legacy. Starting with 2021.2, customer applications no longer need to link against inference_engine_legacy directly; it is linked by the plugins instead.
  • Added support for reading ONNX models that have external data files. To read such a model, pass only the path to the ONNX model to the Core::ReadNetwork() method; the external data files are found and loaded automatically.
  • The logic to detect supported models was improved for the ONNX reader.
  • ONNX dependency updated to v1.7.0
  • Added support for ONNX functions (bottom part of the operators list https://github.com/onnx/onnx/blob/v1.7.0/docs/Operators.md)
  • Improved the documentation and examples about registering custom ops in the ONNX importer
  • The setBatchSize method now uses the reshape method logic to update the input shapes of the model. Additionally, it applies Smart Reshape transformations that relax some non-reshape-able patterns in the model. Using the setBatchSize and reshape methods on the same model is now valid and no longer leads to undefined behavior, as it did in previous releases (see the sketch after this list).

  • On the Windows platform, Inference Engine libraries now have a "Details" section in the file properties. This section contains information about the DLL, including the library description and version.
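
A minimal sketch of the ONNX reading and reshape items above, assuming an ONNX model named model.onnx whose external data files sit next to it, and an input named "data" (the file and input names are placeholders):

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;
    // Pass only the .onnx path; any external data files are found and loaded automatically.
    auto network = core.ReadNetwork("model.onnx");
    // setBatchSize now routes through the reshape logic, so mixing both calls is safe.
    network.setBatchSize(4);
    network.reshape({{"data", {4, 3, 224, 224}}});  // "data" is a placeholder input name
    auto executable = core.LoadNetwork(network, "CPU");
    return 0;
}
```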

Deprecated API

  • The ExecutableNetwork::QueryState method is replaced by InferRequest::QueryState; the old method is deprecated.
  • The IVariableState::GetLastState method was renamed to IVariableState::GetState; the old name is deprecated.
  • IMemoryState was renamed to IVariableState; the old name can still be used but is not recommended. A migration sketch follows.
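
A short migration sketch for the renamed state API, assuming an already created InferenceEngine::ExecutableNetwork named exec (variable names are placeholders):

```cpp
// Old (deprecated in 2021.2): auto states = exec.QueryState();
auto request = exec.CreateInferRequest();
auto states  = request.QueryState();      // per-request variable state
for (auto &state : states) {
    auto id   = state.GetName();          // variable id
    auto blob = state.GetState();         // formerly GetLastState()
}
```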

CPU Plugin

  • Added support for new operations:
    • Loop-5
    • Round-5
    • NonMaxSuppression-3, NonMaxSuppression-5
    • HSigmoid-5
    • LogSoftmax-5
    • GatherND-5
  • Implemented multiple optimizations for the CTCLoss, Pad, Permute, and Elementwise operations. This effort improved CPU performance on customer models and significantly increased the overall performance geomean on the Open Model Zoo scope.
  • Added support for I64/U64 data types on dynamic inputs (via internal conversion to I32).
  • The State API was improved and can now be used in applications with several parallel infer requests:
    • The MKLDNN plugin implementation of the IVariableState::GetName() method is fixed and now returns variable IDs.
    • Added support for IVariableState::GetState in the MKLDNN plugin.

GPU Plugin

  • Support for Intel® Iris® Xe MAX Graphics (formerly codenamed DG1) 
  • Added support for the following operations:
    • HSigmoid-5
    • Round-5
    • LogSoftmax-5
  • Performance improvements for int8 convolutions with asymmetric quantization
  • Added a mechanism for caching compiled kernels on the plugin side, which can be used instead of cl_cache in the driver (see the sketch below).
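
A sketch of enabling the plugin-side kernel cache from application code; the "CACHE_DIR" configuration key is our assumption here, so check the GPU plugin documentation for the exact key:

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;
    // Assumption: "CACHE_DIR" selects the directory for compiled-kernel caching.
    core.SetConfig({{"CACHE_DIR", "./gpu_cache"}}, "GPU");
    auto network = core.ReadNetwork("model.xml");  // placeholder model path
    auto executable = core.LoadNetwork(network, "GPU");
    return 0;
}
```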

MYRIAD Plugin

  • Added the support for new operations:
    • HSwish
    • GatherND
    • Interpolate
    • Ceil
  • Added "bidirectional" mode for Broadcast operation.
  • Added second optional output for Proposal operation.
  • Improved the performance of existing operations:
    • Mish
    • Swish
    • NonMaxSuppression

HDDL Plugin

  • Same new operations and optimizations as in the MYRIAD plugin.
  • Enabled Linux kernel 5.4 support for ION driver.

GNA Plugin

  • Model export now saves layer names so that they can be reused after import.
  • Fixed the handling of some layer combinations.

nGraph

  • Introduced opset5. The new opset contains the new operations listed below. Not all OpenVINO™ toolkit plugins support the operations.
    • BatchNormInference-5
    • GRUSequence-5
    • RNNSequence-5
    • LSTMSequence-5
    • Loop-5
    • Round-5
    • NonMaxSuppression-5
    • HSigmoid-5
    • LogSoftmax-5
  • Implemented public nGraph transformations:
    • LowLatency
      The transformation unrolls TensorIterator nodes so that they can be inferred step by step with low latency, with states stored from one inference to the next. The transformation changes the number of iterations to 1 and replaces back-edges (e.g. RNN state inputs and outputs) with ReadValue and Assign operations. The transformation is available for the CPU and GNA plugins; a usage sketch follows at the end of this section.
  • Public nGraph API changes:
    • The Sink class has been introduced to conveniently identify operations that are "sinks" of the graph (nodes that are not consumed by any other nodes). The nGraph Function API was extended with add/remove sink methods. Currently, only Assign nodes inherit from the Sink class; Result nodes are special and are stored separately, not as sinks.

  • Continued cleanup of the original nGraph codebase from before its integration with the Intel® Distribution of OpenVINO™ toolkit, resulting in the removal of legacy operations that are not supported by the toolkit.
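
A sketch of applying the LowLatency transformation from application code, assuming the InferenceEngine::LowLatency helper declared in ie_transformations.hpp (the exact entry point is our assumption; the transformation can also be run as an nGraph pass):

```cpp
#include <ie_transformations.hpp>
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");  // placeholder IR containing TensorIterator
    InferenceEngine::LowLatency(network);  // unroll TensorIterator, insert ReadValue/Assign
    auto executable = core.LoadNetwork(network, "CPU");
    return 0;
}
```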

Neural Networks Compression Framework (NNCF)

  • Integrated NNCF with OTE/mmdetection for Single-stage Object Detection case.
  • Released NNCF v1.5 for PyTorch:
    • Switched to the propagation-based mode for quantizer setup by default (better integration with HW configs).
    • Implemented improvements for the HAWQ mixed-precision quantization algorithm: compression ratio parameter support, activation quantizer bitwidth selection, and a more generic way to calculate the loss.
    • Supported unified scales for EltWise through VPU HW Config.
    • Enabled GPT2 compression, added pruned googlenet-v1 to the list of supported models.
    • See NNCF Release Notes for details and full list of features.

Post-Training Optimization Tool

  • Introduced model presets in the POT configuration, in particular a preset for Transformer models, which makes it easier for POT users to quantize such models.
  • Improved the POT documentation, including the quantization example. Added a Frequently Asked Questions document.
  • Extended models coverage: +45 models enabled.

Deep Learning Workbench

  • Distribution: DL Workbench is now available in the Intel® DevCloud for the Edge.
  • Added support for GAN models for style-transfer, super-resolution, and inpainting use cases.
  • Added the ability to export profiling experiment results in CSV format.

OpenCV*

  • Updated version to 4.5.1.
  • Added the support for width/height properties in Media SDK (MFX) backend of VideoCapture API.
  • G-API: Added more CV operations, Python bindings for Inference and Streaming APIs, introduced MediaFrame data type for media formats support (for example, NV12).

Samples

  • The order of input layers (for input data files) and output layers (for output and reference files) in the command-line arguments of the speech sample can now be explicitly specified using the new -iname and -oname command-line arguments; see the example below.
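
For example, a hypothetical invocation (the model, file, and layer names are placeholders):

```sh
./speech_sample -m model.xml -d GNA_AUTO -i input.ark -iname Input -o scores.ark -oname affinetransform14
```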

Open Model Zoo

  • Extended the Open Model Zoo with additional CNN-pretrained models and pregenerated Intermediate Representations (.xml + .bin):

    • Replaced the 2021.1 models:

      • text-spotting-0003-detector
      • text-spotting-0003-recognizer-decoder
      • text-spotting-0003-recognizer-encoder
    • Added new models:
      • bert-small-uncased-whole-word-masking-squad-int8-0002
      • bert-small-uncased-whole-word-masking-squad-emb-int8-0001
      • formula-recognition-polynomials-handwritten-0001-decoder
      • formula-recognition-polynomials-handwritten-0001-encoder
      • handwritten-simplified-chinese-recognition-0001
      • human-pose-estimation-0002
      • human-pose-estimation-0003
      • human-pose-estimation-0004
      • person-detection-0003
    • End-of-lifed models:
      • bert-large-whole-word-masking-squad-fp32-0001 renamed to bert-large-uncased-whole-word-masking-squad-0001
  • Extended the list of public models with support for the following models:

| Model Name | Task | Framework |
|---|---|---|
| anti-spoof-mn3 | classification | PyTorch |
| cocosnet | image_translation | PyTorch |
| colorization-v2 | image_processing | PyTorch |
| colorization-siggraph | image_processing | PyTorch |
| common-sign-language-0001 | classification | PyTorch |
| efficientdet-d0-tf | object_detection | TensorFlow |
| efficientdet-d1-tf | object_detection | TensorFlow |
| forward-tacotron-duration-prediction | text_to_speech | PyTorch |
| forward-tacotron-regression | text_to_speech | PyTorch |
| fcrn-dp-nyu-depth-v2-tf | depth_estimation | TensorFlow |
| hrnet-v2-c1-segmentation | semantic_segmentation | PyTorch |
| mozilla-deepspeech-0.8.2 | speech_recognition | TensorFlow |
| shufflenet-v2-x1.0 | classification | PyTorch |
| wavernn-rnn | text_to_speech | PyTorch |
| wavernn-upsampler | text_to_speech | PyTorch |
| yolact-resnet50-fpn-pytorch | instance_segmentation | PyTorch |
| yolo-v4-tf | object_detection | TensorFlow |
  • Replaced old Caffe variants of colorization models with PyTorch variants of the same models.

  • Added new demo applications:
    • Python gesture_recognition_demo (replaces asl_recognition_demo)
    • Python human_pose_estimation_demo (with support for the new human-pose-estimation-0002/3/4 models)
    • Python image_translation_demo
    • Python text-to-speech demo
    • Python object_detection_demo (replaces object_detection_demo_centernet, object_detection_demo_faceboxes, object_detection_demo_retinaface, object_detection_demo_ssd_async, and object_detection_demo_yolov3_async)
    • C++ object_detection_demo (replaces object_detection_demo_ssd_async and object_detection_demo_yolov3_async)
  • Removed deprecated object_detection_demo_faster_rcnn.

  • Open Model Zoo tools:

    • Extended the Model Converter with support of a custom preconvert script, which simplifies conversion of non-frozen model graphs.
    • Extended the Accuracy Checker with coverage of new tasks: image based localization, salient map detection, optical flow estimation, DNA sequencing.
    • Added command-line options for setting input precision and getting intermediate metrics results in the Accuracy Checker.
    • Improved work with GAN models in the Accuracy Checker, extended postprocessing, added new metrics - Inception Score and  Frechet Inception Distance.
    • TensorFlow 2.3 is required to convert the efficientdet-d0/d1 models.

Deep Learning Streamer

  • Direct ONNX model support: the DL Streamer gvadetect, gvaclassify, and gvainference elements now support ONNX models supported by the OpenVINO™ Inference Engine on CPU, without conversion to the Intermediate Representation (IR) format.
  • Full-frame and ROI-based inference: a new 'inference-region' property added to the gvadetect, gvaclassify, and gvainference elements lets developers run inference on the full frame or on a Region of Interest (ROI), for use cases such as back-to-back detection and full-frame classification. A pipeline example follows this list.
  • Imageless object tracking: two new algorithms, 'short-term imageless' and 'zero-term imageless', introduced in gvatrack provide the ability to track objects without accessing image data.
  • Docker file updates: the folder structure created by the Docker file in the DL Streamer GitHub repository is aligned with the Docker image released by OpenVINO™ on DockerHub*. Developers can now use the same instructions and guidelines for DL Streamer regardless of the chosen distribution method (OpenVINO Installer, OpenVINO Docker image, DL Streamer Docker file, or building from source).
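
For illustration, a hypothetical pipeline combining direct ONNX loading with full-frame inference (file names and property values are placeholders; check the DL Streamer documentation for the exact 'inference-region' values):

```sh
gst-launch-1.0 filesrc location=input.mp4 ! decodebin ! \
  gvadetect model=model.onnx device=CPU inference-region=full-frame ! \
  gvawatermark ! videoconvert ! fakesink
```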

For more information on DL Streamer, see the DL Streamer tutorial, API reference, and samples located at the DL Streamer open-source project repository OpenVINO™ Toolkit - DL Streamer on GitHub. The documentation for samples is also available at DL Streamer Samples.

OpenVINO™ Model Server

  • Directed Acyclic Graph (DAG) scheduler (formerly `models ensemble`): this feature was first available as a preview in 2021.1 and is now officially supported, making it possible to define inference pipelines composed of multiple interconnected models that respond to a single prediction request. This release adds support for the remaining API calls that were not supported for DAGs in the preview, specifically `GetModelStatus` and `GetModelMetadata`. `GetModelStatus` returns the status of the complete pipeline, while `GetModelMetadata` returns the pipeline input and output parameters. The 2021.2 release also improves DAG config validation.
  • Direct import of ONNX models: it is now possible to serve ONNX models without converting them to the Intermediate Representation (IR) format, which simplifies deployments using ONNX models and the PyTorch training framework (see the serving sketch after this list).
  • Custom loaders and integration with OpenVINO™ Security Add-on – it is now possible to define a custom library to handle model loading operations – including additional steps related to model decryption and license verification. Review the documentation of the Security Add-on component to learn about model protection.
  • Traffic Encryption – new deployment recipe for client authorization via mTLS certificates and traffic encryption by integrating with NGINX reverse proxy in a Docker container. 
  • Remote Model Caching from cloud storage – models stored in Google Cloud Storage (GCS), Amazon S3 and Azure blob will no longer be downloaded multiple times after configuration changes that require model reloading. Cached model(s) will be used during the model reload operation. When a served model is changed, only the corresponding new version folder will be added to the model storage.
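
A minimal serving sketch for the direct-ONNX case, assuming a model placed under /models/my_model/1/model.onnx (paths and names are placeholders):

```sh
docker run -d -p 9000:9000 -v /models:/models openvino/model_server:latest \
  --model_name my_model --model_path /models/my_model --port 9000
```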

OpenVINO™ Security Add-on

The Security Add-on is a set of tools that allows a model developer to control access to models after development and to check access to models at run time within a controlled environment using the OpenVINO™ Model Server. It consists of development tools to define access controls for a model, a licensing service that checks the model license before the model is loaded into the Model Server, and an isolated environment within which an access-controlled model can be executed in the OpenVINO™ Model Server.

Key Features of the Security Add-on

  • Define access controls for models soon after development.
  • Generate customer-specific licenses that limit the number of days of model use.
  • Check the license validity before loading the model into OpenVINO™ Model Server.
  • Execute models in an isolated environment via KVM Virtual Machines, with the OpenVINO™ Model Server.
  • Control application access to models via NGINX.

New and Changed in Release 1

Executive Summary

  • Introducing a major release in October 2020 (v.2021). You are highly encouraged to upgrade to this version because it introduces new and important capabilities, as well as breaking, backward-incompatible changes.
  • Support for TensorFlow 2.2.x. Introduces official support for models trained in the TensorFlow 2.2.x framework.
  • Support for the Latest Hardware. Introduces official support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake) including new inference performance enhancements with Intel® Iris® Xe Graphics and Intel® DL Boost instructions, as well as Intel® Gaussian & Neural Accelerators 2.0 for low-power speech processing acceleration.
  • Going Beyond Vision. Enables end-to-end capabilities to leverage the Intel® Distribution of OpenVINO™ toolkit for workloads beyond computer vision, which include audio, speech, language, and recommendation, with new pre-trained models, support for public models, code samples and demos, and support for non-vision workloads in OpenVINO™ toolkit DL Streamer.
  • Coming in Q4 2020: (Beta Release) Integration of DL Workbench and the Intel® DevCloud for the Edge. Developers can now graphically analyze models using the DL Workbench on Intel® DevCloud for the Edge (instead of a local machine only) to compare, visualize and fine-tune a solution against multiple remote hardware configurations.
  • OpenVINO™ Model Server. An add-on to the Intel® Distribution of OpenVINO™ toolkit and a scalable microservice that provides a gRPC or HTTP/REST endpoint for inference, making it easier to deploy models in cloud or edge server environments. It is now implemented in C++ to enable a reduced container footprint (for example, less than 500 MB) and deliver higher throughput and lower latency.
  • Now available through the Gitee* and PyPI* distribution methods. You are encouraged to choose the distribution method that suits you and download.

Backward Incompatible Changes Compared with 2020.4

  • See the List of Deprecated API and the API Changes.
  • IRv7 has been deprecated since 2020.3 and is no longer supported in this release. IRv7 and lower cannot be read by Core::ReadNetwork; you are recommended to migrate to IRv10, the highest version. IRv10 provides a streamlined and future-ready operation set that is aligned with public frameworks, better support for low-precision model representation (to preserve accuracy when running in quantized mode), and support for reshapeable models.
  • Removed the Inference Engine NNBuilder API. Use nGraph instead to create a CNN graph from C++ code.
  • Removed the following Inference Engine public API:
    • InferencePlugin, IInferencePlugin, and InferencEnginePluginPtr classes. Use the Core class instead.
    • PluginDispatcher class. Use the Core class instead.
    • CNNNetReader class. Use Core::ReadNetwork instead.
    • PrimitiveInfo, TensorInfo and ExecutableNetwork::GetMappedTopology. Use ExecutableNetwork::GetExecGraphInfo instead.
    • ICNNNetworkStats, NetworkNodeStats, CNNNetwork::getStats and CNNNetwork::setStat. Use IRv10 with FakeQuantize approach for INT8 flow replacement.
    • IShapeInferExtension and CNNNetwork::addExtension. Use IExtension class as a container for nGraph::Nodes which implement shape inference.
    • IEPlugin from the Inference Engine Python API. Use the Core API instead.
    • Data::getCreatorLayer, Data::getInputTo and CNNLayer. Use CNNNetwork::getFunction to iterate over a graph.
  • Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through the ONNX RT Execution Provider for nGraph have been merged with the ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, the ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Migrate to the ONNX RT Execution Provider for the OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware.
  • Deprecated or removed the following nGraph public API:
    • Removed all nGraph methods and classes, which have been deprecated in previous releases.
    • Removed the GetOutputElement operation.
    • Replaced copy_with_new_args() by clone_with_new_inputs().
    • Removed opset0 and back propagation operations.
    • Removed some deprecated operations from opset0 that are not used in newer opsets.
    • Removed support for serializing an nGraph function to the JSON format.
    • Deprecated FusedOp.
  • Changed the structure of the nGraph public API. Removed nGraph builders and reference implementations from nGraph public API. Joined subfolders that have fused and experimental operations with the common operation catalog.
  • Changed the System Requirements. Review the section below.
  • Intel® will be transitioning to the next-generation programmable deep-learning solution based on FPGAs in order to increase the level of customization possible in FPGA deep-learning. As a part of this transition, future standard, that is non-LTS, releases of the Intel® Distribution of OpenVINO™ toolkit will no longer include the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. Intel® Distribution of OpenVINO™ toolkit 2020.3.X LTS release will continue to support Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. For questions about next-generation programmable deep-learning solutions based on FPGAs, talk to your sales representative or contact us to get the latest FPGA updates.

Model Optimizer

Common changes

  • Implemented several optimization transformations to replace sub-graphs of operations with HSwish, Mish, Swish and SoftPlus operations.
  • Model Optimizer now generates IR keeping shape-calculating sub-graphs by default. Previously, this behavior was triggered by the "--keep_shape_ops" command-line parameter. The key is ignored in this release and will be deleted in the next release. To trigger the legacy behavior of generating an IR for a fixed input shape (folding ShapeOf operations and shape-calculating sub-graphs to Constant), use the "--static_shape" command-line parameter (see the example after this list). Note that changing the model input shape using the Inference Engine API at runtime may fail for such an IR.
  • Fixed Model Optimizer conversion issues that resulted in non-reshapeable IRs when using the Inference Engine reshape API.
  • Enabled transformations to fix non-reshapeable patterns in the original networks:
    • Hardcoded Reshape
      • In Reshape(2D)->MatMul pattern
      • Reshape->Transpose->Reshape when the pattern can be fused to the ShuffleChannels or DepthToSpace operation
    • Hardcoded Interpolate
      • In Interpolate->Concat pattern
  • Added a dedicated requirements file for TensorFlow 2.X as well as the dedicated install prerequisites scripts.
  • Replaced the SparseToDense operation with ScatterNDUpdate-4.
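
For example, to reproduce the legacy fixed-shape behavior described above (the model name and shape are placeholders):

```sh
python3 mo.py --input_model model.pb --input_shape [1,224,224,3] --static_shape
```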

ONNX*

  • Enabled the ability to specify the name of the model output tensor using the "--output" command-line parameter.
  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • DepthToSpace-11, 13
    • DequantizeLinear-10 (zero_point must be constant)
    • HardSigmoid-1,6
    • QuantizeLinear-10 (zero_point must be constant)
    • ReduceL1-11, 13
    • ReduceL2-11, 13
    • Resize-11, 13 (except mode="nearest" with 5D+ input, mode="tf_crop_and_resize", and attributes exclude_outside and extrapolation_value with non-zero values)
    • ScatterND-11, 13
    • SpaceToDepth-11, 13

TensorFlow*

  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • CTCLoss
    • EuclideanNorm
    • ExtractImagePatches
    • FloorDiv

MXNet*

  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh

Kaldi*

  • Fixed a bug with ParallelComponent support. It is now fully supported with no restrictions.

Inference Engine

Common changes

  • Migrated to the Microsoft Visual C++ (MSVC) 2019 compiler as the default for Windows, which reduces the binary size of the OpenVINO™ runtime by 2.5x. See Reduce Application Footprint with the Latest Features in Intel® Distribution of OpenVINO™ toolkit for details.
  • See the Deprecation messages and backward-incompatible changes compared with v.2020 Release 4 section for detailed changes in the API.
  • Ported the CPU-based preprocessing path, namely resizing for different numbers of channels, layout conversions, and color space conversions, to AVX2 and AVX512 instruction sets.

Inference Engine Python API

  • Enabled the nGraph Python API, which allows working with the nGraph function from Python, for example, to analyze a loaded graph.
  • Enabled setting parameters of the nodes of the graph.
  • Enabled reading ONNX models with the Python API.

Inference Engine C API

  • No changes

CPU Plugin

  • Improved the performance of the CPU plugin built with the MSVC compiler to align with the version built with the Intel® compiler, which enables the use of MSVC as the default compiler for binary distribution on Windows. This change resulted in a more than 2x binary size reduction for the CPU plugin and other components. See Reduce Application Footprint with the Latest Features in Intel® Distribution of OpenVINO™ toolkit for details.
  • Added the support for new operations:
    • ScatterUpdate-3
    • ScatterElementsUpdate-3
    • ScatterNDUpdate-3
    • Interpolate-4
    • CTCLoss-4
    • Mish-4
    • HSwish-4

GPU Plugin

  • Support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake) 
  • Support for INT8 inference pipeline with optimizations based on Intel® DL Boost for integrated graphics.
  • Support for new operations:
    • Mish
    • Swish
    • SoftPlus
    • HSwish

MYRIAD Plugin

  • Added the support for ONNX Faster R-CNN with a fixed input shape and dynamic output shapes.
  • Added the support for automatic-DMA for custom OpenCL layers.
  • Added the support for new operations:
    • Mish
    • Swish
    • SoftPlus
    • Gelu
    • StridedSlice
    • I32 data type support in Div
  • Improved the performance of existing operations:
    • ROIAlign
    • Broadcast
    • GEMM
  • Added a new option VPU_TILING_CMX_LIMIT_KB to myriad_compile that enables limiting DMA transaction size.
  • OpenCL compiler, targeting Intel® Neural Compute Stick 2 for the SHAVE* processor only, is redistributed with OpenVINO. OpenCL support is provided by ComputeAorta*, and is distributed under a license agreement between Intel® and Codeplay* Software Ltd.

HDDL Plugin

  • Supported automatic-DMA for custom OpenCL layers.
  • Same new operations and optimizations as in the MYRIAD plugin.
  • OpenCL compiler, targeting Intel® Vision Accelerator Design with Intel® Movidius™ VPUs for the SHAVE* processor only, is redistributed with OpenVINO. OpenCL support is provided by ComputeAorta*, and is distributed under a license agreement between Intel® and Codeplay* Software Ltd.

GNA Plugin

  • Added the support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake).
  • Added the support for a number of additional layers and layer combinations, including:
    • Convolution layers for models prepared from TensorFlow framework
    • Power layer with the power parameter different from 1
    • Concat layer with the number of input layers greater than 2
    • 4D element-wise operations
  • Added the support for importing a model from a stream.
  • Added the support for the QoS mechanism on Windows.
  • Added the support for GNA-specific parameters in the Python Benchmark app.

nGraph

  • Introduced opset4. The new opset contains the new operations listed below. Not all OpenVINO™ toolkit plugins support the operations.
    • Acosh-4
    • Asinh-4
    • Atanh-4
    • CTCLoss-4
    • HSwish-4
    • Interpolate-4
    • LSTMCell-4
    • Mish-4
    • Proposal-4
    • Range-4
    • ReduceL1-4
    • ReduceL2-4
    • ScatterNDUpdate-4
    • SoftPlus-4
    • Swish-4
  • Enabled the nGraph Python API, which allows working with the nGraph function from Python, for example, to analyze a loaded graph.
    • Enabled setting parameters of the nodes of the graph.
    • Enabled reading ONNX models with the Python API.

  • Refactored the nGraph Transformation API to give it a transparent structure and make it more user-friendly. Read more in the nGraph Developer's Guide.
  • Changed the structure of the nGraph folder. nGraph public API was separated from the rest of the code, ONNX importer was moved to the frontend folder.

Neural Networks Compression Framework (NNCF)

  • Released NNCF v1.4 for PyTorch:
    • Enabled exporting pruned models to ONNX.
    • Added the support for FP16 fine-tuning for quantization.
    • Added the support for the BatchNorm adaptation as a common compression algorithm initialization step.
    • Improved the performance for per-channel quantization training. Performance is almost on par with per-tensor training.
    • Enabled default quantization of nn.Embedding and nn.Conv1d weights.
    • See NNCF Release Notes for details.

Post-Training Optimization Tool

  • Enabled auto-tuning of quantization parameters in the Accuracy Aware algorithm.
  • Accelerated the Honest Bias Correction algorithm: quantization time improved by ~4x on average compared to 2020.4 for cases when 'use_fast_bias' = false.
  • Productized the Post-training Optimization Toolkit API. Provided samples and documentation to show how to use the API, which covers:
    • Integration to a user’s pipeline
    • Custom data loader, metric calculation, and execution engine
  • The default quantization scheme corresponds to compatibility mode, which is intended to provide almost the same accuracy across different hardware.
  • Extended models coverage: enabled 44 new models.

Deep Learning Workbench

  • Enabled import and profiling of pretrained TensorFlow 2.0 models.
  • Enabled INT8 calibration using different presets exposed by the POT.
  • Enabled INT8 calibration on remote targets.
  • Improved visualization of an IR and a runtime graph, including graph interactions and heat maps. 
  • Added visualization of inference results in an image of user choice. The feature is in experimental mode.

OpenCV*

  • Updated version to 4.5.0.
  • Changed the upstream license to Apache 2 (PR#18073).
  • Added the support for multiple OpenCL contexts in OpenCV applications.

Samples

  • Updated Inference Engine C++ Samples to demonstrate how to load ONNX* models directly.

Open Model Zoo

  • Extended the Open Model Zoo with additional CNN-pretrained models and pregenerated Intermediate Representations (.xml + .bin):

    • Replaced the 2020.4 models:

      • face-detection-0200
      • face-detection-0202
      • face-detection-0204
      • face-detection-0205
      • face-detection-0206
      • person-detection-0200
      • person-detection-0201
      • person-detection-0202
      • person-reidentification-retail-0277
      • person-reidentification-retail-0286
      • person-reidentification-retail-0287
      • person-reidentification-retail-0288
    • Added new models:
      • bert-large-uncased-whole-word-masking-squad-emb-0001
      • bert-small-uncased-whole-word-masking-squad-0002
      • formula-recognition-medium-scan-0001-im2latex-decoder
      • formula-recognition-medium-scan-0001-im2latex-encoder
      • horizontal-text-detection-0001
      • machine-translation-nar-en-ru-0001
      • machine-translation-nar-ru-en-0001
      • person-attributes-recognition-crossroad-0234
      • person-attributes-recognition-crossroad-0238
      • person-vehicle-bike-detection-2000
      • person-vehicle-bike-detection-2001
      • person-vehicle-bike-detection-2002
      • person-vehicle-bike-detection-crossroad-yolov3-1020
      • vehicle-detection-0200
      • vehicle-detection-0201
      • vehicle-detection-0202
    • End-of-lifed models:
      • face-detection-adas-binary-0001
      • pedestrian-detection-adas-binary-0001
      • vehicle-detection-adas-binary-0001
  • Extended the list of public models with support for the following models:

| Model Name | Framework |
|---|---|
| aclnet | PyTorch |
| resnest-50 | PyTorch |
| mozilla-deepspeech-0.6.1 | TensorFlow |
| yolo-v3-tiny-tf | TensorFlow |
  • Added new demo applications:
    • bert_question_answering_embedding_demo, Python
    • formula_recognition_demo, Python
    • machine_translation_demo, Python
    • sound_classification_demo, Python
    • speech_recognition_demo, Python 
  • Open Model Zoo tools:
    • Improved the downloader speed.
    • Added the Accuracy Checker config files to each model folder. For compatibility, soft links from the old location to the new location are kept; in future releases, the soft links will be removed.
    • Simplified the Accuracy Checker configuration files: there is no longer a need to specify the path to the model IR or the target device and precision in a configuration file. Pass these parameters as Accuracy Checker command-line options (see the example after this list). See details in the instructions on how to use predefined configuration files.
    • Extended the Accuracy Checker with the support for optimized preprocessing operations via the Inference Engine preprocessing API.
    • Enabled ONNX models evaluation in the Accuracy Checker without conversion to the IR format.
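
For example, a hypothetical Accuracy Checker invocation passing the model location and target device on the command line rather than in the config file (paths are placeholders):

```sh
accuracy_check -c config.yml -m /path/to/models -s /path/to/datasets -td CPU
```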

Deep Learning Streamer

  • Expanded DL Streamer beyond video by adding support for audio analytics. Added a new gvaaudiodetect element for audio event detection using the AclNet model, and an end-to-end sample pipeline in the samples folder.
  • Added a new element gvametaaggregate to combine the results from multiple branches of a pipeline. This enables the creation of complex pipelines by splitting a pipeline into multiple branches for parallel processing and then combining the results from various branches. 
  • Enabled GPU memory surface sharing, namely zero-copy of data, between VAAPI decode, resize, CSC, and DL Streamer inference elements on GPU to improve the overall pipeline performance.
  • Enabled GPU memory at input and output of gvatrack and gvawatermark elements, thereby removing the need to explicitly convert the memory from GPU to CPU using vaapipostproc when inference is performed on GPU. This not only makes the pipelines portable between the devices with and without GPU but also improves the performance due to the removal of a memory copy step.
  • [Preview] Extended the DL Streamer OS support to Ubuntu 20.04. On Ubuntu 20.04, the DL Streamer will use the GStreamer and its plugins provided by the OS, and thus you have access to all the elements provided by the GStreamer default installation on Ubuntu 20.04.

For more information on DL Streamer, see the DL Streamer tutorial, API reference, and samples documentation at OpenVINO™ Inference Engine Samples and a new home for the DL Streamer open source project located at OpenVINO™ Toolkit - DL Streamer repository on GitHub.

OpenVINO™ Model Server

Model Server is a scalable high-performance tool for serving models optimized with OpenVINO™. It provides an inference service via a gRPC or HTTP/REST endpoint, enabling you to bring your models to production quicker without writing custom code.

Key Features and Enhancements

  • Improved scalability in a single server instance. With the new C++ implementation, you can use the full capacity of available hardware with linear scalability while avoiding any bottleneck on the frontend.
  • Reduced the latency between the client and the server. This is especially noticeable with high-performance accelerators or CPUs.
  • Reduced footprint. By switching to C++ and reducing dependencies, the Docker image is reduced to ~450MB.
  • Added the support for online model updates. The server monitors configuration file changes and reloads models as needed without restarting the service.

For more information about the Model Server, see the open source repo and the Model Server Release Notes. Prebuilt Docker images are available at openvino/model_server

Preview Features Terminology

A preview feature is a functionality that is being introduced to gain early feedback from developers. You are encouraged to submit your comments, questions, and suggestions related to preview features to the forum.

The key properties of a preview feature are:

  • High-quality implementation
  • No guarantee of future existence, compatibility, or security confidence.

Note A preview feature/support is subject to change in the future. It may be removed or radically altered in future releases. Changes to a preview feature do NOT require the usual deprecation and deletion process. Using a preview feature in a production code base is therefore strongly discouraged.

Known Issues

| Jira ID | Description | Component | Workaround |
|---|---|---|---|
| #1 | A number of issues were not addressed yet; see the Known Issues section in the Release Notes for Intel® Distribution of OpenVINO™ toolkit v.2020. | All | N/A |
| 21670 | FC layers with a bimodal weights distribution are not quantized accurately by the Intel® GNA plugin when 8-bit quantization is specified. Weights with values near zero are set to zero. | IE GNA plugin | For now, use 16-bit weights in these cases. |
| 25358 | Some performance degradations are possible in the GPU plugin on GT3e/GT4e/ICL NUC platforms. | IE GPU plugin | N/A |
| 24709 | A retrained TensorFlow Object Detection API RFCN model has significant accuracy degradation. Only the pretrained model produces correct inference results. | All | Use Faster R-CNN models instead of an RFCN model if retraining is required. |
| 26388 | Low-latency (batch size 1) graphs with LSTMCell do not infer properly due to missing state handling. | All | Use deprecated IRv7 and manually insert memory layers into the IR graph. Alternatively, add state tensors as extra input and output nodes and associate their blobs given the IR node IDs after loading the graph. |
| 24101 | Performance and memory consumption may be poor if layers are not 64-byte aligned. | IE GNA plugin | Try to avoid layers that are not 64-byte aligned to make a model GNA-friendly. |
| 30271 | Performance degradation with the Python benchmark_app. | IE Tools | N/A |
| 30571 | The benchmark_app (C++ sample app) does not support models with the NHWC layout. | IE Tools | |
| 32927 | OpenVINO™ benchmark_app does not interpret the nthreads parameter as expected. | IE Tools | |
| 30580 | Running an accuracy check on the entire dataset for UNET2D throws an error. | IE Tools | |
| 28259 | Slow BERT inference in the Python interface. | IE Python | Visible only when importing PyTorch; do not import the PyTorch module. |
| 34660 | Model initialization on OS X fails. | IE C API | |
| 35367 | [IE][TF2] Several models failed the last-tensor check with FP32. | IE MKL-DNN plugin | |
| 39060 | LoadNetwork crashes on CentOS 7 with a large number of models. | IE MKL-DNN plugin | |
| 34087 | [clDNN] Performance degradation on several models due to an upgrade of the OpenCL driver. | clDNN | |
| 33132 | [IE CLDNN] Accuracy and last-tensor check regressions for FP32 models on ICLU GPU. | IE clDNN plugin | |
| 25358 | [clDNN] Performance degradation on NUC and ICE_LAKE targets on R4. | IE clDNN plugin | N/A |
| 38249 | [HETERO] The Hetero plugin does not support an INT8 network with manual graph splitting across two devices. | IE Hetero plugin | |
| 39150 | MLPerf ONNX / Unet3D INT8 failed to score on CFL with a segmentation fault. | IE MKL-DNN plugin | |
| 39136 | Calling LoadNetwork after a failed reshape throws an exception. | IE NG integration | |
| 39175 | [nGraph Python API] Documentation comments are absent in .cpp files. | Documentation | |
| 44653 | Models with a MaxPool operator whose input has dynamic rank fail to import. | ONNX importer | N/A |
| 44606 | [pip] Broken TBB dependency. Affects the 2021.1 release only. | PyPI | Downgrade TBB with tbb==2020.3.254 or install the latest available 2021.2 version. |
| 39275 | Yolov3-PyTorch: incorrect names of NMS outputs. | nGraph | N/A |
| 42203 | Customers from China may experience issues downloading content from the new storage at https://storage.openvinotoolkit.org/ due to the China firewall. | OMZ | Use the branch https://github.com/openvinotoolkit/open_model_zoo/tree/release-01org with links to the old storage, download.01.org. |
| 45117 | The Post-training Optimization Toolkit (POT) cannot be installed on Windows. Affects the 2021.2 release only. | POT | WA1: remove 'ideep4py==2.0.0.post3' from the setup.py file of POT. WA2: install POT using: pip install scipy==1.2.1 jstyleson==0.0.2 numpy==1.16.3 pandas==0.24.2 hyperopt==0.1.2 addict==2.2.1 chainer==7.7.0 && pip install /path/to/pot --no-deps |
| 45045 | It is not possible to use a DL Workbench built by the user from the package for remote profiling. | DL Workbench | Affects only the package version, not the main DL Workbench distribution from Docker Hub. Workaround #1: use the DL Workbench image from Docker Hub. Workaround #2: install Python 3.7 on the remote host. |

Included in This Release

The Intel® Distribution of OpenVINO™ toolkit is available in these versions:

  • OpenVINO™ toolkit for Windows*
  • OpenVINO™ toolkit for Linux*
  • OpenVINO™ toolkit for macOS*
| Component | License | Location | Windows | Linux | macOS |
|---|---|---|---|---|---|
| Deep Learning Model Optimizer: model optimization tool for your trained models | Apache 2.0 | <install_root>/deployment_tools/model_optimizer/* | YES | YES | YES |
| Deep Learning Inference Engine: unified API to integrate the inference with application logic | EULA | <install_root>/deployment_tools/inference_engine/* | YES | YES | YES |
| Inference Engine Headers | Apache 2.0 | <install_root>/deployment_tools/inference_engine/include/* | YES | YES | YES |
| OpenCV* library: OpenCV Community version compiled for Intel® hardware | Apache 2.0 | <install_root>/opencv/* | YES | YES | YES |
| Intel® Media SDK libraries (open source version): eases the integration between the OpenVINO™ toolkit and the Intel® Media SDK | MIT | <install_root>/../mediasdk/* | NO | YES | NO |
| OpenVINO™ toolkit documentation: developer guides and other documentation | | Available from the OpenVINO™ toolkit product site; not part of the installer packages | NO | NO | NO |
| Open Model Zoo: documentation for models from the Intel® Open Model Zoo; use the Model Downloader to download models in a binary format | Apache 2.0 | <install_root>/deployment_tools/open_model_zoo/* | YES | YES | YES |
| Inference Engine Samples: samples that illustrate Inference Engine API usage and demos that show how to use features of the Intel® Distribution of OpenVINO™ toolkit in your application | Apache 2.0 | <install_root>/deployment_tools/inference_engine/samples/* | YES | YES | YES |
| Deep Learning Workbench: enables you to run deep learning models through the OpenVINO™ Model Optimizer, convert models into INT8, fine-tune them, run inference, and measure accuracy | EULA | <install_root>/deployment_tools/tools/workbench/* | YES | YES | YES |
| Post-Training Optimization Toolkit: converts a model into a more hardware-friendly representation by applying specific methods that do not require retraining, for example, post-training quantization | EULA | <install_root>/deployment_tools/tools/post_training_optimization_toolkit/* | YES | YES | YES |
| Speech Libraries and End-to-End Speech Demos | GNA Software License Agreement | <install_root>/data_processing/audio/speech_recognition/* | YES | YES | NO |
| DL Streamer | EULA | <install_root>/data_processing/dl_streamer/* | NO | YES | NO |

Where to Download this Release

System Requirements

Intel® CPU Processors

Hardware:

  • Intel® Atom* processor with Intel® SSE4.1 support
  • Intel® Pentium® processor N4200/5, N3350/5, N3450/5 with Intel® HD Graphics
  • 6th - 11th generation Intel® Core™ processors
  • Intel® Xeon® processor E3, E5, and E7 family (formerly Sandy Bridge, Ivy Bridge, Haswell, and Broadwell)
  • 2nd Generation Intel® Xeon® Scalable Processors (formerly Skylake and Cascade Lake)
  • 3rd Generation Intel® Xeon® Scalable Processors (formerly Cooper Lake)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Ubuntu* 20.04 long-term support (LTS), 64-bit - preview support
  • Windows* 10, 64-bit
  • macOS* 10.15, 64-bit

Intel® Processor Graphics

Hardware:

  • Intel® HD Graphics
  • Intel® UHD Graphics
  • Intel® Iris® Xe Graphics
  • Intel® Iris® Xe Max Graphics 
  • Intel® Iris® Pro Graphics

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit
  • Yocto* 3.0, 64-bit

Note This installation requires drivers that are not included in the Intel® Distribution of OpenVINO™ toolkit package.

Note A chipset that supports processor graphics is required for Intel® Xeon® processors. Processor graphics are not included in all processors. See Product Specifications for information about your processor.

Intel® Gaussian & Neural Accelerator (Intel® GNA)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit

Intel® VPU Processors

Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPU)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit (Linux Kernel 5.2 and below)
  • Windows* 10, 64-bit
  • CentOS* 7.6, 64-bit

Intel® Movidius™ Neural Compute Stick and Intel® Neural Compute Stick 2

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • CentOS* 7.6, 64-bit
  • Windows* 10, 64-bit
  • Raspbian* (target only)

AI Edge Computing Board with Intel® Movidius™ Myriad™ X C0 VPU, MYDX x 1

Operating Systems:

  • Windows* 10, 64-bit

Components Used in Validation

Operating systems used in validation:

  • Linux* OS
    • Ubuntu 18.04.3 with Linux kernel 5.3
    • Ubuntu 18.04.3 with Linux kernel 5.6 for 10th Generation Intel® Core™ Processors (formerly codenamed Ice Lake) and 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake)
    • Ubuntu 20.04 with Linux kernel 5.4
    • CentOS 7.6 with Linux kernel 5.3
    • A Linux* OS build environment needs these components:
      • GNU Compiler Collection (GCC)* 4.8 (CentOS 7), 7.5 (Ubuntu 18), 9.3 (Ubuntu 20)
      • CMake* 3.10 or higher
      • Python* 3.6-3.7, additionally 3.8 for Ubuntu 20
      • OpenCV 4.5
      • Intel® Graphics Compute Runtime. Required only for GPU.
        • 19.41
        • 20.35 for 10th Generation Intel® Core™ Processors (formerly codenamed Ice Lake) and 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake)
  • Windows 10 version 1809 (known as Redstone 5)
  • macOS 10.15

DL frameworks used for validation:

  • TensorFlow 1.15.2, 2.2.0 (limited support according to product features)
  • MXNet 1.5.1

Note Building samples and demos from the Intel® Distribution of OpenVINO™ toolkit package requires CMake* 3.10 or higher.

Helpful Links


Legal Information

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Arria, Core, Movidius, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos

*Other names and brands may be claimed as the property of others.

Copyright © 2020, Intel Corporation. All rights reserved.

For more complete information about compiler optimizations, see our Optimization Notice.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.