Release Notes for Intel® Distribution of OpenVINO™ Toolkit 2021

By Andrey Zaytsev, Alina Alborova

Published: 10/06/2020   Last Updated: 04/01/2021

Note: For the release notes for the 2020 version, refer to Release Notes for Intel® Distribution of OpenVINO™ toolkit 2020.

Introduction

The Intel® Distribution of OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and many others. Based on the latest generations of artificial neural networks, including Convolutional Neural Networks (CNNs) and recurrent and attention-based networks, the toolkit extends computer vision and non-vision workloads across Intel® hardware, maximizing performance. It accelerates applications with high-performance AI and deep learning inference deployed from edge to cloud.

The Intel® Distribution of OpenVINO™ toolkit:

  • Enables deep learning inference from the edge to cloud.
  • Supports heterogeneous execution across Intel accelerators, using a common API for the Intel® CPU, Intel® Integrated Graphics, Intel® Gaussian & Neural Accelerator, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs.
  • Speeds time-to-market through an easy-to-use library of CV functions and pre-optimized kernels.
  • Includes optimized calls for CV standards, including OpenCV* and OpenCL™.

New and Changed in the Release 4 LTS

Executive Summary

  • This release has its own release notes page, Release Notes - OpenVINO v.2021.4 LTS. Further updates for the LTS release will be posted on that page.

New and Changed in the Release 3

Major Features and Improvements

  • Upgrade to the latest version for new capabilities and performance improvements.
  • Introduces a preview of Conditional Compilation (available in the open-source distribution), which enables a significant reduction in the binary footprint of the runtime components (Inference Engine linked into applications) for particular models.
  • Introduces support for the 3rd Gen Intel® Xeon® Scalable platform (code-named Ice Lake), which delivers advanced performance, security, efficiency, and built-in AI acceleration to handle unique workloads and more powerful AI.
  • New pre-trained models and support for public models to streamline development:
    • Pre-trained Models: machine-translation, person-vehicle-bike-detection, text-recognition and text-to-speech.
    • Public Models: aclnet-int8 (sound_classification), deblurgan-v2 (image_processing), fastseg-small and fastseg-large (semantic segmentation) and more.
  • Developer tools are now available as Python wheel packages, installed with pip install openvino-dev on Windows*, Linux*, and macOS*, for easy package installation and upgrades.

Support Change and Deprecation Notices

  • Toolkit component deprecation notice: Intel® Media SDK

    Deprecation Begins March 23, 2021
    Removal Date October 2021
    • Starting with the Intel® Distribution of OpenVINO™ toolkit 2021.3 release, Intel® Media SDK is being deprecated for removal in October 2021.
    • Users are advised to migrate to the Intel oneAPI Video Processing Library (oneVPL) as the unified programming interface for video decoding, encoding, and processing to build portable media pipelines on CPUs, GPUs, and accelerators. Please note the differences and changes in APIs and functionality.
    • Intel® Distribution of OpenVINO™ toolkit will support the Intel oneAPI Video Processing Library (oneVPL) as the replacement for Intel® Media SDK.
    • See the oneVPL Programming Guide for guidelines, and the migration guide from Intel® Media SDK to oneVPL and the API Changes Documentation for reference.
  • Operating system deprecation notice: CentOS*

    Deprecation Begins March 23, 2021
    Removal Date October 2021
    • Intel® Distribution of OpenVINO™ toolkit will continue support for Red Hat Enterprise Linux (RHEL) 8 and will drop support for CentOS in new releases starting 2022.1 (October 2021).
    • See System Requirements for a complete list of supported hardware and operating systems.

       

                   2021.3 (this release)    2021.4 LTS          2022.1 (October 2021)
      Supported    CentOS 7, RHEL 8         CentOS 7, RHEL 8    RHEL 8

       

  • Operating system support change notice: Ubuntu* 18.0x

    Change Notice Begins March 23, 2021
    Support Change Date October 2021
    • Ubuntu 18.0x will move to supported with limitations. New Intel hardware launched with the 2022.1 release and later will not be supported on Ubuntu 18.0x.

    • Starting 2022.1 (October 2021), the new recommended operating system version will be Ubuntu 20.0x.
    • See System Requirements for more info (Recommended Configuration will be marked as bolded starting 2022.1 release).
  • Framework deprecation notice: TensorFlow* 1.x

    Change Notice Begins March 23, 2021
    Support Change Date October 2021
    • TensorFlow 1.x moves to supported with limitations. Any Intermediate Representation (IR) files for TensorFlow 1.x models that were created with OpenVINO version 2020.4 or later are still supported. It is still possible to convert TensorFlow 1.x models to Intermediate Representation (IR) using the latest OpenVINO.
    • Due to TensorFlow’s deprecation of 1.x, generating new IR files for TensorFlow 1.x models requires the use of Python 3.6 with NumPy version 1.19.2.
    • The Intel® Distribution of OpenVINO™ toolkit will continue to support NumPy version 1.19.2 in the 2021.4 LTS release in June 2021. LTS support for 2021.4 extends until June 2023. Starting with release 2022.1 in October 2021, NumPy 1.19.2 will no longer be supported.
    • Users are advised to upgrade to TensorFlow 2.x, or to use TensorFlow 1.x models in the scenarios above.
    • See Migrate your TensorFlow 1 code to TensorFlow 2 documentation and Converting a TensorFlow Model documentation for guidelines.

Model Optimizer

Common changes

  • Updated the urllib3 requirement to “urllib3>=1.25.9” to avoid a security issue that exists in version 1.25.8 of the component.
  • Improved reshape capabilities for models with the StridedSlice operation in ShapeOf subgraphs.
  • Added framework tensor names to output ports in IR.
  • Implemented a new way to enable or disable transformations in MO using the fully defined class name. To do so, put the fully defined class name into the `MO_ENABLED_TRANSFORMS` (or `MO_DISABLED_TRANSFORMS`) environment variable (for example, `extensions.back.NormalizeToNormalizeL2.NormalizeToNormalizeL2`).
  • Implemented sending of basic telemetry data for Model Optimizer (whether the user started MO, the model conversion result, and the MO version).
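
The transformation switch above is driven by a variable that Model Optimizer reads at start-up, so one way to use it is to prepare the environment in a small Python wrapper. This is a minimal sketch under stated assumptions: the script name `mo.py`, the model path, and the chosen class name are placeholders, and the command is only constructed here, not executed.

```python
import os

# Example fully defined class name; substitute the transformation you need.
TRANSFORM = "extensions.back.NormalizeToNormalizeL2.NormalizeToNormalizeL2"


def build_mo_invocation(model_path, disabled_transform):
    """Return (command, environment) for a Model Optimizer run.

    "mo.py" and the model path are placeholders for the local installation.
    """
    env = dict(os.environ)  # copy, so the current process stays untouched
    env["MO_DISABLED_TRANSFORMS"] = disabled_transform
    command = ["python", "mo.py", "--input_model", model_path]
    return command, env


command, env = build_mo_invocation("model.onnx", TRANSFORM)
print(command)
print(env["MO_DISABLED_TRANSFORMS"])
```

To actually run the conversion, the pair could be passed to `subprocess.run(command, env=env)`.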

ONNX*

Added support for the following operations:

  • GatherElements - 11, 13
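
For readers unfamiliar with GatherElements, the following pure-Python sketch illustrates its semantics on 2-D nested lists, following the ONNX operator definition. It is a reference illustration only (the function name is made up here), not the Model Optimizer or plugin implementation.

```python
def gather_elements_2d(data, indices, axis):
    """Reference GatherElements for 2-D nested lists (ONNX semantics).

    The output has the shape of `indices`; each index selects a position
    along `axis`, while the other coordinate is kept as-is.
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for 2-D input")
    out = []
    for i, row in enumerate(indices):
        out_row = []
        for j, idx in enumerate(row):
            if axis == 0:
                out_row.append(data[idx][j])  # index replaces the row
            else:
                out_row.append(data[i][idx])  # index replaces the column
        out.append(out_row)
    return out


result = gather_elements_2d([[1, 2], [3, 4]], [[0, 0], [1, 0]], axis=1)
print(result)  # [[1, 1], [4, 3]]
```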

TensorFlow*

  • CTCGreedyDecoder is now converted to the OpenVINO operation CTCGreedyDecoderSeqLen. Refer to the OpenVINO operation specification for details.
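
As background, greedy CTC decoding with an explicit sequence length can be sketched in a few lines of Python. This shows the generic algorithm (argmax per time step, optional merging of repeats, blank removal); it is an illustration with hypothetical names, and the exact inputs and attributes of CTCGreedyDecoderSeqLen are defined in the operation specification.

```python
def ctc_greedy_decode(logits, seq_len, blank_index, merge_repeated=True):
    """Greedy CTC decoding for a single sequence.

    logits: list of per-time-step score lists (time x classes).
    seq_len: number of valid time steps; later steps are padding.
    """
    decoded = []
    previous = None
    for step in logits[:seq_len]:
        best = max(range(len(step)), key=step.__getitem__)  # argmax
        repeated = merge_repeated and best == previous
        if best != blank_index and not repeated:
            decoded.append(best)
        previous = best
    return decoded


# Three classes; class 2 is the blank. The last row is padding.
logits = [
    [0.9, 0.05, 0.05],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
print(ctc_greedy_decode(logits, seq_len=6, blank_index=2))  # [0, 0, 1]
```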

MXNet*

  • Added support for the following operations:
    • take

TensorFlow 2*

  • Added support for the TensorFlow 2.X Object Detection API models including 8 EfficientDet models, 10 SSD models, 11 Faster R-CNN models and Mask R-CNN model.
  • Added support for the 'while_loop' operation with shape-invariant inputs, i.e. the shape of each tensor in the loop variables is the same in every iteration. Note that the 'while_loop' operation is supported only in an RNN context and may not work in other contexts.
  • Added support for the Keras operation set excluding ConvLSTM2D, MultiHeadAttention, Masking operation, and SELU activation function

Inference Engine

Common changes

  • Introduced the CNNNetwork::getOVNameForTensor() API, which returns the OpenVINO input/output name for a given framework tensor name. This API currently works properly only for ONNX models; the set of supported frameworks will be extended in the next release.
  • Added the InferRequest::Cancel method to cancel inference request execution. The feature is currently supported only by the CPU plugin.
  • The CNNNetwork::Serialize() method supports serialization to the v10 IR format.
  • Deprecated API:
    • The InferenceEngine::IVariableState interface is deprecated; use the InferenceEngine::VariableState C++ wrapper instead.

Inference Engine Python API

  • Added support for setting FP16 blobs
  • Added FP64 data type

CPU plugin

  • The plug-in was migrated to oneDNN v1.6. This brings support for new features (e.g. the AVX-VNNI instruction set architecture (ISA)) and performance optimizations for existing pipelines (e.g. int8 inference on legacy hardware without AVX512 support).
  • Added support for the Conditional Compilation feature in the plug-in itself and in the oneDNN fork. The feature can reduce the CPU plug-in library size several times over for particular user scenarios.
  • Delivered BF16 inference pipeline enhancements. Many operations were extended to directly support BF16 precision. Combined with the integration of new features from oneDNN v1.6, this effort resulted in a large average performance gain compared with the 2021.2 OpenVINO release.
  • Added support for new operations:
    • MVN-6
    • GatherElements-6
    • CTCGreedyDecoderSeqLen-6
    • ROIAlign-3
  • Implemented multiple optimizations for operations: Split, Pad, MVN.
  • Significantly decreased memory consumption in throughput scenario for quantized models. 
  • Added support for the InferRequest::Cancel method, which makes it possible to interrupt inference request execution at intermediate stages.

GPU plugin

  • Added support for the following operations:
    • MVN-6
    • CTCGreedyDecoderSeqLen-6
    • ScatterElementsUpdate-3
    • ScatterUpdate-3
    • Broadcast-3
  • Performance improvements:
    • Fine-tuning of int8 and fp16 convolution kernels for gen12lp GPUs
    • Optimizations for Reduce kernel over all spatial dimensions
    • Optimizations for Transpose operation for NCHW → NHWC cases
  • Load time improvements

MYRIAD plugin

  • Added the support for new operations:
    • HSwish
    • GatherND
    • Interpolate
    • Ceil
  • Added "bidirectional" mode for Broadcast operation.
  • Added second optional output for Proposal operation.
  • Improved the performance of existing operations:
    • Mish
    • Swish
    • NonMaxSuppression
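
The "bidirectional" Broadcast mode mentioned above follows two-way (NumPy-style) broadcasting, where either the data shape or the target shape may be expanded. A small sketch of the resulting shape computation, with a hypothetical helper name, might look like this:

```python
def bidirectional_broadcast_shape(data_shape, target_shape):
    """Output shape under two-way (NumPy-style) broadcasting rules."""
    rank = max(len(data_shape), len(target_shape))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    a = [1] * (rank - len(data_shape)) + list(data_shape)
    b = [1] * (rank - len(target_shape)) + list(target_shape)
    result = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError(f"dimensions {x} and {y} are not broadcastable")
        # A dimension of 1 stretches to match the other side.
        result.append(y if x == 1 else x)
    return result


shape = bidirectional_broadcast_shape([16, 1, 1], [1, 1, 50, 50])
print(shape)  # [1, 16, 50, 50]
```

Unlike one-directional broadcasting, neither shape has to be the "larger" one here: each side contributes the dimensions where the other side is 1.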

HDDL plugin

  • Same new operations and optimizations as in the MYRIAD plugin.
  • Enabled Linux kernel 5.4 support for ION driver.

GNA plugin

  • Model export now saves layer names, so they can be reused after import.
  • Fixed some layer combinations.

nGraph

  • Introduced opset6. The new opset contains the new operations listed below. Not all OpenVINO™ toolkit plugins support the operations.
    • MVN-6
    • GatherElements-6
    • CTCGreedyDecoderSeqLen-6
    • ExperimentalDetectronTopKROIs-6
    • ExperimentalDetectronGenerateProposalsSingleImage-6
    • ExperimentalDetectronDetectionOutput-6
    • ExperimentalDetectronPriorGridGenerator-6
    • ExperimentalDetectronROIFeatureExtractor-6
  • Public nGraph API changes

    • add_parameters/remove_parameter methods were added

  • ONNX Importer changes
    • The number of headers in the public API directory reduced to the required minimum (others have been moved to the "src" directory)
    • New operators support: GatherElements, ReduceSum (opset 13), ExperimentalDetectron and PriorBoxClustered (non-standard ops), BitShift
    • Bert Squad opset 10 support
    • ONNX dependency updated to v1.8.0

Post-Training Optimization Tool (POT)

  • Added an optional layer-wise fine-tuning mechanism for INT8 quantization, which helps improve the accuracy of quantized models. The mechanism is enabled by the 'use_layerwise_tuning' parameter.
  • Introduced a new representation for quantized weights to address related changes in Model Optimizer.
  • Implemented performance optimizations for quantized mobilenetv3 and hbonet models.
  • Added an INT8 quantization sample with the AccuracyAware algorithm for the MobileNetV1 FPN model through the POT SW API.
  • Extended model coverage: +44 models enabled.
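
To show where the 'use_layerwise_tuning' parameter could sit, here is a sketch that builds a POT-style configuration as a Python dictionary and serializes it to JSON. The overall layout, the algorithm name, and the other parameter values are assumptions for illustration, and all file names are placeholders; consult the POT documentation for the authoritative schema.

```python
import json

pot_config = {
    "model": {
        "model_name": "example_model",   # placeholder name
        "model": "example_model.xml",    # placeholder IR paths
        "weights": "example_model.bin",
    },
    "engine": {"config": "accuracy_checker.yml"},  # placeholder
    "compression": {
        "algorithms": [
            {
                "name": "DefaultQuantization",  # assumed algorithm choice
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300,
                    # Parameter named in the release note above.
                    "use_layerwise_tuning": True,
                },
            }
        ]
    },
}

config_text = json.dumps(pot_config, indent=4)
print(config_text)
```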

Neural Networks Compression Framework (NNCF)

  • Integrated NNCF with OTE for Instance Segmentation case.
  • Released NNCF v1.6.0 for PyTorch:
    • Added UNet@Mapillary (25%) and SSD300@VOC (40%) as filter pruned sample models
    • Batch norm adaptation by default for all compression algorithms
    • ONNX domain set to org.openvinotoolkit for custom OpenVINO operations such as "FakeQuantize"
    • Quantization of nn.Embedding and nn.EmbeddingBag weights for CPU
    • Option to optimize logarithms of quantizer scales instead of scales themselves directly
    • Support inserting compression operations as pre-hooks to PyTorch operations
    • Extended the ONNX compressed model exporting interface with an option to explicitly name input and output tensors
    • AutoQ - an AutoML-based mixed-precision initialization mode for quantization

Deep Learning Workbench

  • Extended support for landmark detection and face reidentification use cases: IR conversion, AC measurement, INT8 calibration, and profiling (checked on the sphereface, facenet, arcface, MobileFace, and landmarks-regression-retail-0009 models). Dataset support is extended to the LFW and VGGFace2 datasets.
  • Added runtime precision analytics. The tool now provides runtime precision statistics, precision visualization in the runtime graph (Netron), and a table of precision transitions.
  • Created and exposed an advising system based on precision analysis (a set of rules based on model precision). The advising system indicates common performance issues and guides the user towards a logical flow (such as INT8 calibration).
  • Added the ability to export the Accuracy Checker config from DL Workbench, for example to use it in the CLI.
  • Added the ability to import an external Accuracy Checker config file and edit it interactively in DL Workbench.
  • Created and exposed an OpenVINO JupyterLab sandbox through DL Workbench. Users can now use the OpenVINO sandbox with a set of prepared notebooks (classification, object detection, semantic segmentation, style transfer) and can port their own model from DL Workbench to continue experimenting.
  • UX improvements for model comparison mode, IR and runtime graph visualization, and more.
  • Starting with the Intel® Distribution of OpenVINO™ toolkit 2021.3 release, DL Workbench is available only as a prebuilt Docker image. The reference to DL Workbench is kept in the OpenVINO installation, but it now pulls a prebuilt image from DockerHub instead of building it from the package.

OpenCV*

  • Updated version to 4.5.2.
  • Added support of dynamically loaded parallel-processing backends (prebuilt TBB plugin included).
  • Enabled libva interoperability support.

Samples

  • Added a new Python sample (hello_reshape_ssd), which runs inference using object detection networks like SSD-VGG. The sample shows how to use shape inference.
  • Extended the C sample (hello_classification_async) to show how to use batching.

Open Model Zoo

Extended the Open Model Zoo with additional CNN-pretrained models and pre-generated Intermediate Representations (.xml + .bin). The models are grouped below as replacing 2021.2 models, new, and end-of-lifed:

  • Replacing 2021.2 models:
    • human-pose-estimation-0005
    • human-pose-estimation-0006
    • human-pose-estimation-0007
    • instance-segmentation-security-0002
    • instance-segmentation-security-0091
    • instance-segmentation-security-0228
    • instance-segmentation-security-1039
    • instance-segmentation-security-1040
    • text-spotting-0004-detector
    • text-spotting-0004-recognizer-decoder
    • text-spotting-0004-recognizer-encoder
  • New: 
    • machine-translation-nar-en-de-0001
    • machine-translation-nar-de-en-0001
    • person-vehicle-bike-detection-2003
    • person-vehicle-bike-detection-2004
    • text-recognition-0013
    • text-to-speech-en-0001-duration-prediction
    • text-to-speech-en-0001-regression
    • text-to-speech-en-0001-generation
  • End-of-lifed:
    • instance-segmentation-security-0010
    • instance-segmentation-security-0050
    • instance-segmentation-security-0083
    • instance-segmentation-security-1025
    • human-pose-estimation-0002
    • human-pose-estimation-0003
    • human-pose-estimation-0004

The list of public models extended with the support for the following models:

Model Name         Task                     Framework
aclnet-int8        Sound Classification     PyTorch
deblurgan-v2       Image Processing         PyTorch
densenet-201-tf    Classification           TensorFlow
dla-34             Classification           PyTorch
fastseg-large      Semantic Segmentation    PyTorch
fastseg-small      Semantic Segmentation    PyTorch
netvlad-tf         Place Recognition        TensorFlow
regnetx-3.2gf      Classification           PyTorch
rexnet-v1-x1.0     Classification           PyTorch
ssh-mxnet          Object Detection         MXNet

Restructured the Open Model Zoo demo folders: the <omz_dir>\demos\python_demos folder was removed, and each demo's implementations are now located in the cpp, cpp_gapi, and python subfolders of that demo's folder. Note that not every OMZ demo has all of these implementations.

Added new demo applications:

  • Python face_detection_mtcnn_demo
  • Python deblurring_demo
  • Python place_recognition demo 

Extended object_detection_demo with support for new models, including Yolo-V4, and extended segmentation_demo with support for new models.

Open Model Zoo tools:

  • Model Downloader:
    • Fixed downloading models with archived files when the value of --output_dir is a relative path.
  • Model Converter:

    • Decreased memory usage of the PyTorch-to-ONNX conversion stage (depending on the model, the reduction can be up to ~33%).
    • If a subprocess is terminated by a signal, the name of that signal is now printed.
  • Accuracy Checker: 
    • Added support for TensorFlow 2.x as an inference backend.
    • Improved the subset selection logic with the option to dump and upload the list of images.
    • Added an offline evaluation mode: the inference and metric calculation parts of the pipeline are separated and can be performed on different machines.
    • The annotation saving mechanism now provides metadata about the conversion step.
    • Extended functionality to cover new tasks: named entity recognition and see in the dark; extended support for automatic speech recognition and text-to-speech translation methods.

Deep Learning Streamer

  • Updated gvadetect to include post-processing for YoloV4 models.
  • A new property ‘config’ is introduced in gvatrack to configure the tracking algorithm. It allows developers to specify the maximum number of objects to be tracked thereby reducing compute and increasing throughput. It can also improve accuracy by allowing developers to select whether to retain the tracking ID based on the position of the bounding box even if the detected class changes due to a model inaccuracy.
  • A new property ‘object-class’ is added to gvadetect and gvainference. It provides an ability to run the secondary inference on a particular object class only.
  • Improved performance and portability of pipelines across CPU and GPU by removing the need for videoconvert between decode and inference elements.
  • The samples are updated to accept the command-line options for inference device selection and enabling of FPS counter or output video rendering. This provides more flexibility to run the samples without code modification.
  • Preview in open source repo: Introducing DL Streamer for Windows. Now you can build object detection and object classification pipelines for Windows OS by using DL Streamer. The preview is available on branch ‘preview/support-for-windows’.
  • Preview in open source repo: A new element gvasegment is introduced to perform segmentation. The preview is available on branch ‘preview/segmentation’.

OpenVINO™ Model Server

  • Custom Node support for Directed Acyclic Graph Scheduler. Custom nodes in OpenVINO Model Server simplify linking deep learning models into a complete pipeline even if the inputs and outputs of the sequential models do not fit. In many cases, the output of one model cannot be passed directly to another one. The data might need to be analyzed, filtered, or converted to a different format. Those operations cannot be easily implemented in AI frameworks or are simply not supported. Custom nodes address this challenge. They allow employing a dynamic library developed in C++ or C to perform arbitrary data transformations.
  • DAG demultiplexing - Directed Acyclic Graph Scheduler allows creating pipelines with Node output demultiplexing into separate sub-outputs and branched pipeline execution. It can improve execution performance and address scenarios where any number of intermediate batches produced by custom nodes can be processed separately and collected at any graph stage.
  • Example custom node for an OCR pipeline - A use case scenario for a custom node and execution demultiplexing is demonstrated in an OCR pipeline: https://github.com/openvinotoolkit/model_server/blob/v2021.3/docs/east_ocr.md. It combines the east-resnet50 model with the CRNN model for complete text detection and text recognition. This custom node analyzes the response of the east-resnet50 model. Based on the inference results and the original image, it generates a list of detected boxes for text recognition. Each image in the output is resized to the predefined target size to fit the next inference model in the DAG pipeline (CRNN).
  • Support for stateful models - A stateful model recognizes dependencies between consecutive inference requests. It maintains state between inference requests so that the next inference depends on the results of previous ones. OVMS now allows submitting inference requests in the context of a specific sequence. OVMS stores the model state and returns prediction results based on the history of requests from the client. See https://github.com/openvinotoolkit/model_server/blob/develop/docs/stateful_models.md
  • Control API - extended the REST API to provide a way to trigger OVMS configuration updates. The config/reload endpoint triggers applying configuration changes and reloading models. It ensures that configuration changes are deployed at a specific time and confirms the status of the reload operation. The /config endpoint reports all served models and their versions. It simplifies the usage model from the client side and connection troubleshooting.
  • Helm chart enhancements - added multiple configuration options for deployment with new scenarios: new model storage classes, Kubernetes resource restrictions, and security context. Fixed a defect with large-scale deployments.
  • Kubernetes Operator - enabled OVMS deployments using the Kubernetes Operator for OVMS. This offering can be used to simplify management of OVMS services at scale in OpenShift and in open source Kubernetes. This offering is to be published in https://operatorhub.io
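
The Control API endpoints described above can be exercised from any HTTP client. The sketch below only constructs the two requests with Python's standard library; the host, the REST port, and the `/v1` path prefix are assumptions about a particular deployment, so check them against the Model Server documentation before use.

```python
import urllib.request


def build_control_requests(host="localhost", rest_port=8000):
    """Build (reload_request, status_request) for the OVMS Control API.

    Host, port, and the "/v1" prefix are assumptions for illustration.
    """
    base = f"http://{host}:{rest_port}/v1/config"
    # POST triggers a configuration reload; the body is empty.
    reload_request = urllib.request.Request(f"{base}/reload", data=b"", method="POST")
    # GET reports the served models and their versions.
    status_request = urllib.request.Request(base, method="GET")
    return reload_request, status_request


reload_request, status_request = build_control_requests()
print(reload_request.get_method(), reload_request.full_url)
print(status_request.get_method(), status_request.full_url)
```

Against a running server, each request could be sent with `urllib.request.urlopen(...)` and the JSON response inspected for the reload status.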

OpenVINO™ Security Add-on

  • Documented SWTPM-HWTPM quote binding details.
  • Tested with latest 5.0 version of tpm2-tools.
  • Updated command line options to ovsatool as documented in the Getting Started Guide.

New Distributions

  • Python Packages:
    • OpenVINO developer tools are now available as Python wheel packages. To install Inference Engine, Model Optimizer, Post-Training Optimization Tool, and the Accuracy Checker utility, run pip install openvino-dev in your Python virtual environment(s). The runtime-only wheel package (pip install openvino) has also been updated, and it is no longer necessary to set the PATH variable on Windows or the LD_LIBRARY_PATH variable on Linux and macOS when using either of the Python packages. The packages can be installed on many versions of Linux and Windows, with official support on the following:

      Supported Operating System                         Python* Version (64-bit)
      Ubuntu* 18.04 long-term support (LTS), 64-bit      3.6, 3.7
      Ubuntu* 20.04 long-term support (LTS), 64-bit      3.6, 3.7
      Red Hat* Enterprise Linux* 8, 64-bit               3.6, 3.7
      CentOS* 7, 64-bit                                  3.6, 3.7
      macOS* 10.15.x versions                            3.6, 3.7, 3.8
      Windows 10*, 64-bit                                3.6, 3.7, 3.8
  • Containers:
    • New Ubuntu 20 dev Docker image is available on DockerHub container registry.
      • Includes Inference Engine, OpenCV, samples, demos, Model Optimizer, Post-training Optimization tool, Accuracy checker and Open Model Zoo tools.
      • Supports CPU, GPU, VPU, GNA and HDDL devices.
    • New RHEL 8 runtime Docker image is available on Red Hat Quay.io container registry with CPU, GPU plugins support.
      • Includes Inference Engine and OpenCV.
      • Supports CPU and GPU devices.
    • New Dockerfile to build Inference Engine from source with OpenCV and Open Model Zoo for Ubuntu 18.04.
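
After installing one of the Python wheel packages listed above, a quick way to confirm that the package is visible to the current interpreter is to look it up without importing it. This is a small sketch using only the standard library; the helper name is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("openvino"):
    print("OpenVINO Python package found.")
else:
    print("OpenVINO not found; try: pip install openvino-dev")
```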

New and Changed in the Release 2

Executive Summary

  • Integrates the Deep Learning Workbench with the Intel® DevCloud for the Edge as a Beta release. Graphically analyze models using the Deep Learning Workbench on the Intel® DevCloud for the Edge (instead of a local machine only) to compare, visualize and fine-tune a solution against multiple remote hardware configurations.
  • Introduces support for Red Hat Enterprise Linux (RHEL) 8.2. See System Requirements for more info. Runtime package is available for downloading.
  • Introduces per-channel quantization support in the Model Optimizer for models quantized with TensorFlow Quantization-Aware Training containing per-channel quantization for weights, which improves performance by model compression and latency reduction.
  • Pre-trained models and support for public models to streamline development:
    • Public Models: Yolov4 (for object detection), AISpeech (for speech recognition), and DeepLabv3 (for semantic segmentation)
    • Pre-trained Models: Human Pose Estimation (update), Formula Recognition Polynomial Handwritten (new), Machine Translation (update), Common Sign Language Recognition (new), and Text-to-Speech (new)
  • New OpenVINO™ Security Add-on, which controls access to model(s) through secure packaging and execution. Based on KVM Virtual machines and Docker* containers and compatible with the OpenVINO™ Model Server, this new add-on enables packaging for flexible deployment and controlled model access.
  • The PyPI project moved from openvino-python to openvino, and the 2021.1 version will be removed from the default view. Users who depend on that exact version can still install it with openvino-python==2021.1.
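
To make the term "per-channel quantization" from the summary above concrete, the following sketch compares one shared scale for a whole weight tensor with a separate scale per output channel, using symmetric int8 rounding. The weight values are made up, and this is a generic illustration of the idea, not the Model Optimizer implementation.

```python
def roundtrip_error(channel, scale):
    """Largest error after symmetric int8 quantize/dequantize with `scale`."""
    worst = 0.0
    for value in channel:
        q = max(-127, min(127, round(value / scale)))  # quantize and saturate
        worst = max(worst, abs(value - q * scale))     # compare after dequantize
    return worst


# Two output channels with very different value ranges (made-up numbers).
weights = [[0.01, -0.02, 0.015], [5.0, -3.0, 4.0]]

# Per-tensor: a single scale derived from the global maximum.
global_scale = max(abs(v) for ch in weights for v in ch) / 127
per_tensor_errors = [roundtrip_error(ch, global_scale) for ch in weights]

# Per-channel: each channel gets a scale from its own maximum.
channel_scales = [max(abs(v) for v in ch) / 127 for ch in weights]
per_channel_errors = [roundtrip_error(ch, s) for ch, s in zip(weights, channel_scales)]

print("per-tensor errors :", per_tensor_errors)
print("per-channel errors:", per_channel_errors)
```

With a shared scale, the small-valued channel is rounded very coarsely; with its own scale, its round-trip error drops by orders of magnitude while the large-valued channel is unaffected.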

Model Optimizer

Common changes

  • Updated requirements for the numpy component to avoid compatibility issues with TensorFlow 1.x.
  • Improved reshape-ability of models with eltwise and CTCGreedyDecoder operations

ONNX*

  • Added the ability to specify the model output tensor name using the "--output" command line parameter.
  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • DepthToSpace-11, 13
    • DequantizeLinear-10 (zero_point must be constant)
    • HardSigmoid-1,6
    • QuantizeLinear-10 (zero_point must be constant)
    • ReduceL1-11, 13
    • ReduceL2-11, 13
    • Resize-11, 13 (except mode="nearest" with 5D+ input, mode="tf_crop_and_resize", and attributes exclude_outside and extrapolation_value with non-zero values)
    • ScatterND-11, 13
    • SpaceToDepth-11, 13
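
Among the operations listed above, QuantizeLinear and DequantizeLinear form a pair that is easy to illustrate. The sketch below follows the ONNX definitions for the uint8 case with a constant scale and zero point; it is a reference illustration on plain Python lists, not the converter's code.

```python
def quantize_linear(x, scale, zero_point):
    """uint8 QuantizeLinear: saturate(round(x / scale) + zero_point)."""
    # Python's round() rounds halves to even, as the ONNX definition requires.
    return [max(0, min(255, round(v / scale) + zero_point)) for v in x]


def dequantize_linear(q, scale, zero_point):
    """DequantizeLinear: (q - zero_point) * scale."""
    return [(v - zero_point) * scale for v in q]


x = [0.0, 2.0, 3.0, 1000.0, -254.0, -1000.0]
q = quantize_linear(x, scale=2.0, zero_point=128)
restored = dequantize_linear(q, scale=2.0, zero_point=128)
print(q)         # [128, 129, 130, 255, 1, 0]
print(restored)  # [0.0, 2.0, 4.0, 254.0, -254.0, -256.0]
```

The values 1000.0 and -1000.0 show saturation: they fall outside the representable range and do not survive the round trip.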

TensorFlow*

  • Added support for the TensorFlow Object Detection API models with pre-processing block when mean/scale values are applied prior to resizing of the image. Previously only the case when mean/scale values are applied after the resize was supported.
  • Aligned FakeQuantized limits adjustment with TensorFlow approach
  • Added support for the following operations:
    • GatherND
    • Round
    • NonMaxSuppression
    • LogSoftmax
    • FakeQuantWithMinMaxVarsPerChannel

MXNet*

  • Added support for the following operations:
    • GatherND
    • Round

Kaldi*

  • Added support for the following operations:
    • TdnnComponent

Inference Engine

Common changes

  • Removed the dependency on inference_engine_legacy. Starting with 2021.2, customer applications no longer need to link inference_engine_legacy directly; it is linked by the plugins directly.
  • Added support for reading ONNX models that have external data files. To read such a model, pass only the path to the ONNX model to the core.ReadNetwork() method; the external data files are found and loaded automatically.
  • Improved the logic for detecting supported models in the ONNX reader.
  • ONNX dependency updated to v1.7.0.
  • Added support for ONNX functions (bottom part of the operators list https://github.com/onnx/onnx/blob/v1.7.0/docs/Operators.md)
  • Improved the documentation and examples for registering custom ops in the ONNX importer.
  • The setBatchSize method now uses the reshape method logic to update the input shapes of the model. Additionally, it applies Smart Reshape transformations that relax some non-reshape-able patterns in the model. Using the setBatchSize and reshape methods on the same model is now valid and no longer leads to undefined behavior, as it did in previous releases.

  • On the Windows platform, Inference Engine libraries have a new "Details" section in the file properties. This section contains information about the DLL, including the library description and version.

Deprecated API

  • The ExecutableNetwork::QueryState method is replaced by the InferRequest::QueryState method; the old one is deprecated.
  • The IVariableState::GetLastState method was renamed to IVariableState::GetState; the old one is deprecated.
  • IMemoryState was renamed to IVariableState; the old name can still be used but is not recommended.

CPU Plugin

  • Added support for new operations:
    • Loop-5
    • Round-5
    • NonMaxSuppression-3, NonMaxSuppression-5
    • HSigmoid-5
    • LogSoftmax-5
    • GatherND-5
  • Implemented multiple optimizations for the CTCLoss, Pad, Permute, and Elementwise operations. This effort improved CPU performance on customer models and significantly increased the overall performance geomean across the Open Model Zoo scope.
  • Added support of I64/U64 data types on dynamic inputs (via internal conversion to I32).
  • State API was improved and now can be used in applications with several parallel infer requests:
    • The MKLDNN plugin implementation of the IVariableState::GetName() method is fixed and now returns variable IDs.
    • Added support of IVariableState::GetState in MKLDNN plugin
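
Of the new operations in the list above, HSigmoid is simple enough to state directly: it is a piecewise-linear approximation of the sigmoid, commonly defined as min(max(x + 3, 0), 6) / 6, and the related HSwish activation multiplies the input by it. A scalar sketch of these standard formulas, with hypothetical function names:

```python
def hsigmoid(x):
    """Hard sigmoid: min(max(x + 3, 0), 6) / 6."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def hswish(x):
    """Hard swish: x * hsigmoid(x)."""
    return x * hsigmoid(x)


for value in (-4.0, 0.0, 1.0, 4.0):
    print(value, hsigmoid(value), hswish(value))
```

Both functions avoid the exponential of the exact sigmoid, which is the usual motivation for using them in mobile-oriented networks.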

GPU Plugin

  • Support for Intel® Iris® Xe MAX Graphics (formerly codenamed DG1) 
  • Added support for the following operations:
    • HSigmoid-5
    • Round-5
    • LogSoftMax-5
  • Performance improvements for int8 convolutions with asymmetric quantization
  • Added mechanism for compiled kernels caching on the plugin side which might be used instead of cl_cache in the driver.

MYRIAD Plugin

  • Added the support for new operations:
    • HSwish
    • GatherND
    • Interpolate
    • Ceil
  • Added "bidirectional" mode for Broadcast operation.
  • Added second optional output for Proposal operation.
  • Improved the performance of existing operations:
    • Mish
    • Swish
    • NonMaxSuppression

HDDL Plugin

  • Same new operations and optimizations as in the MYRIAD plugin.
  • Enabled Linux kernel 5.4 support for ION driver.

GNA Plugin

  • Model export now saves layer names, so they can be reused after import.
  • Fixed some layer combinations.

nGraph

  • Introduced opset5. The new opset contains the new operations listed below. Not all OpenVINO™ toolkit plugins support the operations.
    • BatchNormInference-5
    • GRUSequence-5
    • RNNSequence-5
    • LSTMSequence-5
    • Loop-5
    • Round-5
    • NonMaxSuppression-5
    • HSigmoid-5
    • LogSoftmax-5
  • Implemented public nGraph transformations:
    • LowLatency
      The transformation unrolls TensorIterator nodes so that they are inferred step by step with low latency, with states stored from one inference to the next. The transformation changes the number of iterations to 1 and replaces back-edges (e.g. RNN state inputs and outputs) with ReadValue and Assign operations. The transformation is available for the CPU and GNA plugins.
  • Public nGraph API changes:
    • The Sink class has been introduced to conveniently identify operations that are "sinks" (nodes are not consumed by any other nodes) of the graph. The nGraph Function API was extended by methods "add/remove sinks". Currently, only Assign nodes are inherited from the Sync class, Result nodes are special and are stored separately, not sinks.

  • Continued clean up of nGraph original codebase from before the integration with the Intel® Distribution of OpenVINO™ toolkit resulting in the removal of legacy operations that are not supported by the toolkit.
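
The idea behind the LowLatency transformation described above can be sketched in plain Python. The toy cell, class, and function names below are hypothetical and only mimic the behavior: a loop over the whole sequence is replaced by single-step calls whose state persists between calls, which is the role ReadValue and Assign play in the transformed graph.

```python
def cell(x, h):
    """Toy recurrent cell standing in for the TensorIterator body."""
    return 0.5 * h + x


def infer_whole_sequence(sequence, h0=0.0):
    """Before: a single inference call iterates over the entire sequence."""
    h = h0
    for x in sequence:
        h = cell(x, h)
    return h


class StatefulStep:
    """After: each call runs one iteration and the state survives between calls."""

    def __init__(self, h0=0.0):
        self.state = h0  # the variable behind the ReadValue/Assign pair

    def infer(self, x):
        h = self.state       # analogous to ReadValue
        h = cell(x, h)
        self.state = h       # analogous to Assign
        return h


sequence = [1.0, 2.0, 3.0]
step = StatefulStep()
outputs = [step.infer(x) for x in sequence]
```

Each `infer` call returns as soon as one time step is processed, so a result is available without waiting for the rest of the sequence.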

Neural Networks Compression Framework (NNCF)

  • Integrated NNCF with OTE/mmdetection for Single-stage Object Detection case.
  • Released NNCF v1.5 for PyTorch:
    • Switched to propagation-based mode for quantizer setup by default (better integration with HW configs).
    • Implemented improvements for HAWQ mixed-precision quantization algorithm: compression ratio parameter support, activation quantizers bitwidth selection, more generic way to calculate the loss.
    • Supported unified scales for EltWise through VPU HW Config.
    • Enabled GPT2 compression, added pruned googlenet-v1 to the list of supported models.
    • See NNCF Release Notes for details and full list of features.

Post-Training Optimization Tool

  • Introduced model presets in the POT configuration, in particular a preset for Transformer models, which makes it easier for POT users to quantize such models.
  • Improved POT documentation including the quantization example. Added the Frequently Asked Questions document.
  • Extended model coverage: 45 more models enabled.

Deep Learning Workbench

  • Distribution: DL Workbench is now available in the Intel DevCloud for the Edge
  • Added support for GAN models for style-transfer, super-resolution, and inpainting use cases 
  • Added the ability to export profiling experiment results in CSV format

OpenCV*

  • Updated version to 4.5.1.
  • Added the support for width/height properties in Media SDK (MFX) backend of VideoCapture API.
  • G-API: Added more CV operations, Python bindings for Inference and Streaming APIs, introduced MediaFrame data type for media formats support (for example, NV12).

Samples

  • The order of input layers (for input data files) and output layers (for output and reference files) in command line arguments of the speech sample can now be explicitly specified using new command-line arguments (-iname and -oname).

Open Model Zoo

  • Extended the Open Model Zoo with additional CNN-pretrained models and pregenerated Intermediate Representations (.xml + .bin):

    • Replaced the 2021.1 models:

      • text-spotting-0003-detector
      • text-spotting-0003-recognizer-decoder
      • text-spotting-0003-recognizer-encoder
    • Added new models:
      • bert-small-uncased-whole-word-masking-squad-int8-0002
      • bert-small-uncased-whole-word-masking-squad-emb-int8-0001
      • formula-recognition-polynomials-handwritten-0001-decoder
      • formula-recognition-polynomials-handwritten-0001-encoder
      • handwritten-simplified-chinese-recognition-0001
      • human-pose-estimation-0002
      • human-pose-estimation-0003
      • human-pose-estimation-0004
      • person-detection-0003
    • End-of-lifed models:
      • bert-large-whole-word-masking-squad-fp32-0001 renamed to bert-large-uncased-whole-word-masking-squad-0001
  • Extended the list of public models with support for the following models:

    | Model Name | Task | Framework |
    |---|---|---|
    | anti-spoof-mn3 | Classification | PyTorch |
    | cocosnet | Image Translation | PyTorch |
    | colorization-v2 | Image Processing | PyTorch |
    | colorization-siggraph | Image Processing | PyTorch |
    | common-sign-language-0001 | Classification | PyTorch |
    | efficientdet-d0-tf | Object Detection | TensorFlow |
    | efficientdet-d1-tf | Object Detection | TensorFlow |
    | forward-tacotron-duration-prediction | Text to Speech | PyTorch |
    | forward-tacotron-regression | Text to Speech | PyTorch |
    | fcrn-dp-nyu-depth-v2-tf | Depth Estimation | TensorFlow |
    | hrnet-v2-c1-segmentation | Semantic Segmentation | PyTorch |
    | mozilla-deepspeech-0.8.2 | Speech Recognition | TensorFlow |
    | shufflenet-v2-x1.0 | Classification | PyTorch |
    | wavernn-rnn | Text to Speech | PyTorch |
    | wavernn-upsampler | Text to Speech | PyTorch |
    | yolact-resnet50-fpn-pytorch | Instance Segmentation | PyTorch |
    | yolo-v4-tf | Object Detection | TensorFlow |
  • Replaced old Caffe variants of colorization models with PyTorch variants of the same models.

  • Added new demo applications:
    • Python gesture_recognition_demo (replaces asl_recognition_demo)
    • Python human_pose_estimation_demo (with support of new human-pose-estimation-0002/3/4 models)
    • Python image_translation_demo
    • Python text to speech demo
    • Python object_detection_demo (replaces object_detection_demo_centernet, object_detection_demo_faceboxes, object_detection_demo_retinaface, object_detection_demo_ssd_async and object_detection_demo_yolov3_async)
    • C++ object_detection_demo (replaces object_detection_demo_ssd_async and object_detection_demo_yolov3_async)
  • Removed deprecated object_detection_demo_faster_rcnn.

  • Open Model Zoo tools:

    • Extended the Model Converter with support of a custom preconvert script, which simplifies conversion of non-frozen model graphs.
    • Extended the Accuracy Checker with coverage of new tasks: image based localization, salient map detection, optical flow estimation, DNA sequencing.
    • Added command-line options for setting input precision and getting intermediate metrics results in the Accuracy Checker.
    • Improved work with GAN models in the Accuracy Checker, extended postprocessing, and added new metrics: Inception Score and Frechet Inception Distance.
    • TensorFlow 2.3 is required to convert efficientdet-d0/d1 models.

Deep Learning Streamer

  • Direct ONNX model support: The DL Streamer gvadetect, gvaclassify, and gvainference elements now support ONNX models supported by the OpenVINO™ Inference Engine on CPU without converting them to the Intermediate Representation (IR) format.
  • Full frame and ROI based inference: A new 'inference-region' property added to the gvadetect, gvaclassify, and gvainference elements allows developers to run inference on the full frame or on an ROI (Region of Interest) for use cases such as back-to-back detection and full-frame classification.
  • Imageless object tracking: Two new algorithms introduced in gvatrack, 'short-term imageless' and 'zero-term imageless', provide the ability to track objects without accessing image data.
  • Docker file updates: The folder structure created with the Docker file in DL Streamer GitHub is aligned with the Docker image released by OpenVINO™ on DockerHub*. Developers can now use the same instructions and guidelines for using DL Streamer regardless of the chosen way of distribution (OpenVINO Installer, OpenVINO Docker image, DL Streamer Docker file, building from source).

For more information on DL Streamer, see the DL Streamer tutorial, API reference, and samples located at the DL Streamer open-source project repository OpenVINO™ Toolkit - DL Streamer on GitHub. The documentation for samples is also available at DL Streamer Samples.

OpenVINO™ Model Server

  • Directed Acyclic Graph (DAG) scheduler (formerly `models ensemble`) – this feature was first available as a preview in 2021.1. It is now officially supported, making it possible to define inference pipelines composed of multiple interconnected models that respond to a single prediction request. This release adds support for the remaining API calls that were not supported for DAGs in the preview, specifically `GetModelStatus` and `GetModelMetadata`. `GetModelStatus` returns the status of the complete pipeline, while `GetModelMetadata` returns the pipeline input and output parameters. The 2021.2 release also improves DAG configuration validation.
  • Direct import of ONNX models – it is now possible to serve ONNX models without converting to Intermediate Representation (IR) format. This helps simplify deployments using ONNX models and the PyTorch training framework.
  • Custom loaders and integration with OpenVINO™ Security Add-on – it is now possible to define a custom library to handle model loading operations – including additional steps related to model decryption and license verification. Review the documentation of the Security Add-on component to learn about model protection.
  • Traffic Encryption – new deployment recipe for client authorization via mTLS certificates and traffic encryption by integrating with NGINX reverse proxy in a Docker container. 
  • Remote Model Caching from cloud storage – models stored in Google Cloud Storage (GCS), Amazon S3 and Azure blob will no longer be downloaded multiple times after configuration changes that require model reloading. Cached model(s) will be used during the model reload operation. When a served model is changed, only the corresponding new version folder will be added to the model storage.
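
To make the DAG scheduler item above more concrete, here is a minimal sketch of how a pipeline of interconnected nodes can answer a single request by running each node once all of its upstream results are available. The data layout, node names, and `run_pipeline` helper are hypothetical; this is neither the Model Server implementation nor its configuration schema.

```python
def run_pipeline(nodes, request):
    """Run nodes in dependency order.

    nodes maps a node name to (callable, [upstream node names]);
    the special name "request" refers to the pipeline input.
    """
    results = {"request": request}
    pending = dict(nodes)
    while pending:
        ready = [
            name for name, (_, deps) in pending.items()
            if all(dep in results for dep in deps)
        ]
        if not ready:
            raise ValueError("cycle or missing dependency in pipeline definition")
        for name in ready:
            fn, deps = pending.pop(name)
            results[name] = fn(*[results[dep] for dep in deps])
    return results


# Toy "models": a detector, a classifier fed by the detector, and a merge step.
pipeline = {
    "detect": (lambda image: [v for v in image if v > 0], ["request"]),
    "classify": (lambda boxes: ["big" if b > 10 else "small" for b in boxes], ["detect"]),
    "merge": (lambda boxes, labels: list(zip(boxes, labels)), ["detect", "classify"]),
}
response = run_pipeline(pipeline, [5, -1, 20])["merge"]
```

Validating the definition up front (no cycles, no references to unknown nodes) is the kind of check that configuration validation for such pipelines needs to perform before serving requests.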

OpenVINO™ Security Add-on

The Security Add-on is a set of tools that allows a model developer to control access to models post-development and to check for access to models at run time within a controlled environment using the OpenVINO™ Model Server. It consists of development tools to define access controls for a model, a licensing service to check the model license before loading the model into the Model Server, and an isolated environment in which an access-controlled model can be executed within the OpenVINO™ Model Server.

Key Features of Security Add-on

  • Define access controls to models soon after development.
  • Generate customer specific licenses limiting days of model use.
  • Check the license validity before loading the model into OpenVINO™ Model Server.
  • Execute models in an isolated environment via KVM Virtual Machines, with the OpenVINO™ Model Server.
  • Control application access to models via NGINX.

New and Changed in the Release 1

Executive Summary

  • Introducing a major release in October 2020 (v.2021). You are highly encouraged to upgrade to this version because it introduces new and important capabilities, as well as breaking and backward-incompatible changes.
  • Support for TensorFlow 2.2.x. Introduces official support for models trained in the TensorFlow 2.2.x framework.
  • Support for the Latest Hardware. Introduces official support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake) including new inference performance enhancements with Intel® Iris® Xe Graphics and Intel® DL Boost instructions, as well as Intel® Gaussian & Neural Accelerators 2.0 for low-power speech processing acceleration.
  • Going Beyond Vision. Enables end-to-end capabilities to leverage the Intel® Distribution of OpenVINO™ toolkit for workloads beyond computer vision, which include audio, speech, language, and recommendation, with new pre-trained models, support for public models, code samples and demos, and support for non-vision workloads in OpenVINO™ toolkit DL Streamer.
  • Coming in Q4 2020: (Beta Release) Integration of DL Workbench and the Intel® DevCloud for the Edge. Developers can now graphically analyze models using the DL Workbench on Intel® DevCloud for the Edge (instead of a local machine only) to compare, visualize and fine-tune a solution against multiple remote hardware configurations.
  • OpenVINO™ Model Server. An add-on to the Intel® Distribution of OpenVINO™ toolkit and a scalable microservice, which provides a gRPC or HTTP/REST endpoint for inference, makes it easier to deploy models in cloud or edge server environments. It is now implemented in C++ to enable reduced container footprint (for example, less than 500MB) and deliver higher throughput and lower latency.
  • Now available through Gitee* and PyPI* distribution methods. You are encouraged to choose from the distribution methods and download. 

Backward Incompatible Changes Compared with 2020.4

  • List of Deprecated API, API Changes
  • IRv7 has been deprecated since 2020.3, and it is no longer supported in this release. You cannot read IRv7 and lower versions using Core::ReadNetwork; migrating to IRv10, the highest version, is recommended. IRv10 provides a streamlined and future-ready operation set that is aligned with public frameworks, better support for the representation of low-precision models in order to keep accuracy when running in the quantized mode, and support for reshapeable models.
  • Removed the Inference Engine NNBuilder API. Use nGraph instead to create a CNN graph from C++ code.
  • Removed the following Inference Engine public API:
    • InferencePlugin, IInferencePlugin, and InferenceEnginePluginPtr classes. Use the Core class instead.
    • PluginDispatcher class. Use the Core class instead.
    • CNNNetReader class. Use Core::ReadNetwork instead.
    • PrimitiveInfo, TensorInfo and ExecutableNetwork::GetMappedTopology. Use ExecutableNetwork::GetExecGraphInfo instead.
    • ICNNNetworkStats, NetworkNodeStats, CNNNetwork::getStats and CNNNetwork::setStat. Use IRv10 with FakeQuantize approach for INT8 flow replacement.
    • IShapeInferExtension and CNNNetwork::addExtension. Use IExtension class as a container for nGraph::Nodes which implement shape inference.
    • IEPlugin from the Inference Engine Python API. Use the Core API instead.
    • Data::getCreatorLayer, Data::getInputTo and CNNLayer. Use CNNNetwork::getFunction to iterate over a graph.
  • Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through the ONNX RT Execution Provider for nGraph have been merged with the ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, the ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Migrate to the ONNX RT Execution Provider for the OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware.
  • Deprecated or removed the following nGraph public API:
    • Removed all nGraph methods and classes, which have been deprecated in previous releases.
    • Removed the GetOutputElement operation.
    • Replaced copy_with_new_args() by clone_with_new_inputs().
    • Removed opset0 and back propagation operations.
    • Removed some operations from opset0, deprecated operations from the opset, which are not used in newer opsets.
    • Removed the support for serializing an nGraph function to the JSON format.
    • Deprecated FusedOp.
  • Changed the structure of the nGraph public API. Removed nGraph builders and reference implementations from nGraph public API. Joined subfolders that have fused and experimental operations with the common operation catalog.
  • Changed the System Requirements. Review the section below.
  • Intel® will be transitioning to the next-generation programmable deep-learning solution based on FPGAs in order to increase the level of customization possible in FPGA deep-learning. As a part of this transition, future standard, that is non-LTS, releases of the Intel® Distribution of OpenVINO™ toolkit will no longer include the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. Intel® Distribution of OpenVINO™ toolkit 2020.3.X LTS release will continue to support Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. For questions about next-generation programmable deep-learning solutions based on FPGAs, talk to your sales representative or contact us to get the latest FPGA updates.
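
As an illustration of the Python item in the removed-API list above (IEPlugin replaced by the Core API), the following sketch shows a typical read, load, and infer sequence with IECore. The function name and its arguments are hypothetical, the model paths and input dictionary are placeholders supplied by the caller, and the import is done lazily so that the snippet can be loaded on a machine without OpenVINO installed.

```python
def infer_with_core(model_xml, model_bin, inputs, device_name="CPU"):
    """Read an IR, load it on a device, and run one synchronous inference."""
    # Lazy import: requires the OpenVINO Python package at call time only.
    from openvino.inference_engine import IECore

    ie = IECore()  # used in place of the removed IEPlugin class
    net = ie.read_network(model=model_xml, weights=model_bin)
    exec_net = ie.load_network(network=net, device_name=device_name)
    # inputs maps input layer names to arrays of the expected shape.
    return exec_net.infer(inputs=inputs)
```

A call would look like `infer_with_core("model.xml", "model.bin", {"data": image})`, where the file names and the "data" input name are assumptions that depend on the model being used.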

Model Optimizer

Common changes

  • Implemented several optimization transformations to replace sub-graphs of operations with HSwish, Mish, Swish and SoftPlus operations.
  • Model Optimizer now generates an IR that keeps shape-calculating sub-graphs by default. Previously, this behavior was triggered if the "--keep_shape_ops" command line parameter was provided. The key is ignored in this release and will be deleted in the next release. To trigger the legacy behavior and generate an IR for a fixed input shape (folding ShapeOf operations and shape-calculating sub-graphs to Constant), use the "--static_shape" command line parameter. Changing the model input shape at runtime using the Inference Engine API may fail for such an IR.
  • Fixed Model Optimizer conversion issues that resulted in IRs that could not be reshaped using the Inference Engine reshape API.
  • Enabled transformations to fix non-reshapeable patterns in the original networks:
    • Hardcoded Reshape
      • In Reshape(2D)->MatMul pattern
      • Reshape->Transpose->Reshape when the pattern can be fused to the ShuffleChannels or DepthToSpace operation
    • Hardcoded Interpolate
      • In Interpolate->Concat pattern
  • Added a dedicated requirements file for TensorFlow 2.X as well as the dedicated install prerequisites scripts.
  • Replaced the SparseToDense operation with ScatterNDUpdate-4.
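
For reference, the fused operations named in the first item of this list correspond to the elementwise formulas below. A model that spells one of them out with primitive operations can be replaced by the single fused operation. This is a plain-Python sketch of the standard definitions; the exact sub-graph patterns that Model Optimizer matches are not listed here.

```python
import math


def relu6(x):
    return min(max(x, 0.0), 6.0)


def softplus(x):
    """SoftPlus(x) = ln(1 + e^x)."""
    return math.log1p(math.exp(x))


def swish(x, beta=1.0):
    """Swish(x) = x * sigmoid(beta * x)."""
    return x / (1.0 + math.exp(-beta * x))


def mish(x):
    """Mish(x) = x * tanh(SoftPlus(x))."""
    return x * math.tanh(softplus(x))


def hswish(x):
    """HSwish(x) = x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0
```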

ONNX*

  • Enabled specifying the model output tensor name using the "--output" command line parameter.
  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • DepthToSpace-11, 13
    • DequantizeLinear-10 (zero_point must be constant)
    • HardSigmoid-1, 6
    • QuantizeLinear-10 (zero_point must be constant)
    • ReduceL1-11, 13
    • ReduceL2-11, 13
    • Resize-11, 13 (except mode="nearest" with 5D+ input, mode="tf_crop_and_resize", and attributes exclude_outside and extrapolation_value with non-zero values)
    • ScatterND-11, 13
    • SpaceToDepth-11, 13

TensorFlow*

  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh
    • CTCLoss
    • EuclideanNorm
    • ExtractImagePatches
    • FloorDiv

MXNet*

  • Added support for the following operations:
    • Acosh
    • Asinh
    • Atanh

Kaldi*

  • Fixed bug with ParallelComponent support. Now it is fully supported with no restrictions.

Inference Engine

Common changes

  • Migrated to the Microsoft Visual C++ (MSVC) 2019 compiler as the default for Windows, which reduces the binary size of the OpenVINO™ runtime by 2.5x. See Reduce Application Footprint with the Latest Features in Intel® Distribution of OpenVINO™ toolkit for details.
  • See the Backward Incompatible Changes Compared with 2020.4 section for detailed changes in the API.
  • Ported the CPU-based preprocessing path, namely resizing for different numbers of channels, layout conversions, and color space conversions, to AVX2 and AVX512 instruction sets.

Inference Engine Python API

  • Enabled the nGraph Python API, which enables communicating with the nGraph function using Python. This enables performing analysis of the loaded graph.
  • Enabled setting parameters of the nodes for the graph. 
  • Enabled reading ONNX models with the Python API.

Inference Engine C API

  • No changes

CPU Plugin

  • Improved performance of CPU plugin built with MSVC compiler to align with the version built with the Intel® compiler, which enables the use of MSVC as the default compiler for binary distribution on Windows. This change resulted in more than 2x binary size reduction for CPU plugin and other components. See Reduce Application Footprint with the Latest Features in Intel® Distribution of OpenVINO™ toolkit for details.
  • Added the support for new operations:
    • ScatterUpdate-3
    • ScatterElementsUpdate-3
    • ScatterNDUpdate-3
    • Interpolate-4
    • CTCLoss-4
    • Mish-4
    • HSwish-4

GPU Plugin

  • Support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake) 
  • Support for INT8 inference pipeline with optimizations based on Intel® DL Boost for integrated graphics.
  • Support for new operations:
    • Mish
    • Swish
    • SoftPlus
    • HSwish

MYRIAD Plugin

  • Added the support for ONNX Faster R-CNN with a fixed input shape and dynamic output shapes.
  • Added the support for automatic-DMA for custom OpenCL layers.
  • Added the support for new operations:
    • Mish
    • Swish
    • SoftPlus
    • Gelu
    • StridedSlice
    • I32 data type support in Div
  • Improved the performance of existing operations:
    • ROIAlign
    • Broadcast
    • GEMM
  • Added a new option VPU_TILING_CMX_LIMIT_KB to myriad_compile that enables limiting DMA transaction size.
  • OpenCL compiler, targeting Intel® Neural Compute Stick 2 for the SHAVE* processor only, is redistributed with OpenVINO. OpenCL support is provided by ComputeAorta*, and is distributed under a license agreement between Intel® and Codeplay* Software Ltd.

HDDL Plugin

  • Supported automatic-DMA for custom OpenCL layers.
  • Same new operations and optimizations as in the MYRIAD plugin.
  • OpenCL compiler, targeting Intel® Vision Accelerator Design with Intel® Movidius™ VPUs for the SHAVE* processor only, is redistributed with OpenVINO. OpenCL support is provided by ComputeAorta*, and is distributed under a license agreement between Intel and Codeplay* Software Ltd.

GNA Plugin

  • Added the support for 11th Generation Intel® Core™ Processor Family for Internet of Things (IoT) Applications (formerly codenamed Tiger Lake).
  • Added the support for a number of additional layers and layer combinations, including:
    • Convolution layers for models prepared from TensorFlow framework
    • Power layer with the power parameter different from 1
    • Concat layer with the number of input layers greater than 2
    • 4D element-wise operations
  • Added the support for importing a model from a stream.
  • Added the support for the QoS mechanism on Windows.
  • Added the support for GNA-specific parameters in the Python Benchmark Tool.

nGraph

  • Introduced opset4. The new opset contains the new operations listed below. Not all OpenVINO™ toolkit plugins support the operations.
    • Acosh-4
    • Asinh-4
    • Atanh-4
    • CTCLoss-4
    • HSwish-4
    • Interpolate-4
    • LSTMCell-4
    • Mish-4
    • Proposal-4
    • Range-4
    • ReduceL1-4
    • ReduceL2-4
    • ScatterNDUpdate-4
    • SoftPlus-4
    • Swish-4
  • Enabled the nGraph Python API, which enables communicating with the nGraph function using Python. This enables you to perform analysis of a loaded graph.
    • Enabled setting parameters of the nodes for the graph.
    • Enabled reading ONNX models with the Python API.
  • Refactored the nGraph Transformation API to give it a transparent structure and make it more user-friendly. Read more in the nGraph Developer's Guide.
  • Changed the structure of the nGraph folder. nGraph public API was separated from the rest of the code, ONNX importer was moved to the frontend folder.

Neural Networks Compression Framework (NNCF)

  • Released NNCF v1.4 for PyTorch:
    • Enabled exporting pruned models to ONNX.
    • Added the support for FP16 fine-tuning for quantization.
    • Added the support for the BatchNorm adaptation as a common compression algorithm initialization step.
    • Improved the performance for per-channel quantization training. Performance is almost on par with per-tensor training.
    • Enabled default quantization of nn.Embedding and nn.Conv1d weights.
    • See NNCF Release Notes for details.

Post-Training Optimization Tool

  • Enabled auto-tuning of quantization parameters in the Accuracy Aware algorithm.
  • Accelerated the Honest Bias Correction algorithm. Quantization is ~4x faster on average compared with 2020.4 when 'use_fast_bias' = false.
  • Productized the Post-training Optimization Toolkit API. Provided samples and documentation to show how to use the API, which covers:
    • Integration to a user’s pipeline
    • Custom data loader, metric calculation, and execution engine
  • The default quantization scheme corresponds to the compatibility mode, which is intended to provide almost the same accuracy across different hardware.
  • Extended model coverage: enabled 44 new models.

Deep Learning Workbench

  • Enabled import and profiling of pretrained TensorFlow 2.0 models.
  • Enabled INT8 calibration using different presets exposed by the POT.
  • Enabled INT8 calibration on remote targets.
  • Improved visualization of an IR and a runtime graph, including graph interactions and heat maps. 
  • Added visualization of inference results on an image of the user's choice. The feature is in experimental mode.

OpenCV*

  • Updated version to 4.5.0.
  • Changed the upstream license to Apache 2 (PR#18073).
  • Added the support for multiple OpenCL contexts in OpenCV applications.

Samples

  • Updated Inference Engine C++ Samples to demonstrate how to load ONNX* models directly.

Open Model Zoo

  • Extended the Open Model Zoo with additional CNN-pretrained models and pregenerated Intermediate Representations (.xml + .bin):

    • Replaced the 2020.4 models:

      • face-detection-0200
      • face-detection-0202
      • face-detection-0204
      • face-detection-0205
      • face-detection-0206
      • person-detection-0200
      • person-detection-0201
      • person-detection-0202
      • person-reidentification-retail-0277
      • person-reidentification-retail-0286
      • person-reidentification-retail-0287
      • person-reidentification-retail-0288
    • Added new models:
      • bert-large-uncased-whole-word-masking-squad-emb-0001
      • bert-small-uncased-whole-word-masking-squad-0002
      • formula-recognition-medium-scan-0001-im2latex-decoder
      • formula-recognition-medium-scan-0001-im2latex-encoder
      • horizontal-text-detection-0001
      • machine-translation-nar-en-ru-0001
      • machine-translation-nar-ru-en-0001
      • person-attributes-recognition-crossroad-0234
      • person-attributes-recognition-crossroad-0238
      • person-vehicle-bike-detection-2000
      • person-vehicle-bike-detection-2001
      • person-vehicle-bike-detection-2002
      • person-vehicle-bike-detection-crossroad-yolov3-1020
      • vehicle-detection-0200
      • vehicle-detection-0201
      • vehicle-detection-0202
    • End-of-lifed models:
      • face-detection-adas-binary-0001
      • pedestrian-detection-adas-binary-0001
      • vehicle-detection-adas-binary-0001
  • Extended the list of public models with support for the following models:

    | Model Name | Framework |
    |---|---|
    | aclnet | PyTorch |
    | resnest-50 | PyTorch |
    | mozilla-deepspeech-0.6.1 | TensorFlow |
    | yolo-v3-tiny-tf | TensorFlow |
  • Added new demo applications:
    • bert_question_answering_embedding_demo, Python
    • formula_recognition_demo, Python
    • machine_translation_demo, Python
    • sound_classification_demo, Python
    • speech_recognition_demo, Python 
  • Open Model Zoo tools:
    • Improved the downloader speed.
    • Added the Accuracy Checker config files to each model folder. For compatibility, soft links to the new location are kept in the old location. The soft links will be removed in future releases.
    • Simplified the Accuracy Checker configuration files: there is no need to specify the path to the model IR or the target device and precision in a configuration file. Pass these parameters as Accuracy Checker command-line options. See details in the instruction on how to use predefined configuration files.
    • Extended the Accuracy Checker with the support for optimized preprocessing operations via the Inference Engine preprocessing API.
    • Enabled ONNX models evaluation in the Accuracy Checker without conversion to the IR format.

Deep Learning Streamer

  • Expanded the DL Streamer beyond video by adding the support for audio analytics. Added a new element gvaaudiodetect for audio event detection using the AclNet model. Added an end-to-end sample of the pipeline to the samples folder.
  • Added a new element gvametaaggregate to combine the results from multiple branches of a pipeline. This enables the creation of complex pipelines by splitting a pipeline into multiple branches for parallel processing and then combining the results from various branches. 
  • Enabled GPU memory surface sharing, namely zero-copy of data, between VAAPI decode, resize, CSC, and DL Streamer inference elements on GPU to improve the overall pipeline performance.
  • Enabled GPU memory at input and output of gvatrack and gvawatermark elements, thereby removing the need to explicitly convert the memory from GPU to CPU using vaapipostproc when inference is performed on GPU. This not only makes the pipelines portable between the devices with and without GPU but also improves the performance due to the removal of a memory copy step.
  • [Preview] Extended the DL Streamer OS support to Ubuntu 20.04. On Ubuntu 20.04, the DL Streamer will use the GStreamer and its plugins provided by the OS, and thus you have access to all the elements provided by the GStreamer default installation on Ubuntu 20.04.

For more information on DL Streamer, see the DL Streamer tutorial, API reference, and samples documentation at OpenVINO™ Inference Engine Samples and a new home for the DL Streamer open source project located at OpenVINO™ Toolkit - DL Streamer repository on GitHub.

OpenVINO™ Model Server

Model Server is a scalable high-performance tool for serving models optimized with OpenVINO™. It provides an inference service via a gRPC or HTTP/REST endpoint, enabling you to bring your models to production quicker without writing custom code.
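
As a rough illustration of the HTTP/REST endpoint, the sketch below builds a prediction request in the TensorFlow Serving-style REST format that the Model Server exposes. The helper name, host, port, and model name are placeholders chosen for this example, and the request is only constructed here, not sent.

```python
import json


def build_predict_request(host, port, model_name, instances, version=None):
    """Return the URL and JSON body for a REST predict call."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        path += f"/versions/{version}"
    url = f"http://{host}:{port}{path}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return url, body


# Placeholder values; a real deployment defines its own port and model name.
url, body = build_predict_request("localhost", 8000, "resnet", [[1, 2]])
```

The pair can then be posted with any HTTP client, for example `urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})`.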

Key Features and Enhancements

  • Improved scalability in a single server instance. With the new C++ implementation, you can use the full capacity of available hardware with linear scalability while avoiding any bottleneck on the frontend.
  • Reduced the latency between the client and the server. This is especially noticeable with high-performance accelerators or CPUs.
  • Reduced footprint. By switching to C++ and reducing dependencies, the Docker image is reduced to ~450MB.
  • Added the support for online model updates. The server monitors configuration file changes and reloads models as needed without restarting the service.

For more information about the Model Server, see the open source repo and the Model Server Release Notes. Prebuilt Docker images are available at openvino/model_server.

Preview Features Terminology

A preview feature is a functionality that is being introduced to gain early feedback from developers. You are encouraged to submit your comments, questions, and suggestions related to preview features to the forum.

The key properties of a preview feature are:

  • High-quality implementation
  • No guarantee of future existence, compatibility, or security confidence.

Note A preview feature/support is subject to change in the future. It may be removed or radically altered in future releases. Changes to a preview feature do NOT require the usual deprecation and deletion process. Using a preview feature in a production code base is therefore strongly discouraged.

Known Issues

Jira ID | Description | Component | Workaround
#1 | A number of issues have not been addressed yet; see the Known Issues section in the Release Notes for Intel® Distribution of OpenVINO™ toolkit v.2020. | All | N/A
21670 | FC layers with a bimodal weight distribution are not quantized accurately by the Intel® GNA Plugin when 8-bit quantization is specified. Weights with values near zero are set to zero. | IE GNA Plugin | For now, use 16-bit weights in these cases.
25358 | Some performance degradations are possible in the GPU plugin on GT3e/GT4e/ICL NUC platforms. | IE GPU Plugin | N/A
24709 | Retrained TensorFlow Object Detection API RFCN model has significant accuracy degradation. Only the pretrained model produces correct inference results. | All | Use Faster-RCNN models instead of an RFCN model if retraining of a model is required.
26388 | Low latency (batch size 1) graphs with LSTMCell do not infer properly due to missing state handling. | All | Use deprecated IRv7 and manually insert memory layers into the IR graph. Alternatively, add state tensors as extra input and output nodes and associate their blobs given the IR node IDs after loading the graph.
24101 | Performance and memory consumption may degrade if layers are not 64-byte aligned. | IE GNA Plugin | Try to avoid layers that are not 64-byte aligned to make a model GNA-friendly.
28259 | Slow BERT inference in the Python interface. | IE Python | This is only visible when importing PyTorch. Do not import the PyTorch module.
35367 | [IE][TF2] Several models failed on the last tensor check with FP32. | IE MKL-DNN Plugin |
39060 | LoadNetwork crash on CentOS 7 with a large number of models. | IE MKL-DNN Plugin |
34087 | [clDNN] Performance degradation on several models due to an upgrade of the OpenCL driver. | clDNN |
33132 | [IE CLDNN] Accuracy and last-tensor check regressions for FP32 models on ICLU GPU. | IE clDNN Plugin |
25358 | [clDNN] Performance degradation on NUC and ICE_LAKE targets on R4. | IE clDNN Plugin | N/A
39136 | Calling LoadNetwork after a failed reshape throws an exception. | IE NG integration |
42203 | Customers from China may experience some issues with downloading content from the new storage https://storage.openvinotoolkit.org/ due to the China firewall. | OMZ | Use the branch https://github.com/openvinotoolkit/open_model_zoo/tree/release-01org, which links to the old storage download.01.org.
24757 | The heterogeneous mode does not work for GNA. | IE GNA Plugin | Split the model to run unsupported layers on CPU.
48177 | Can't import the IE Python API with python3.8. | Python API | Use the wheel package on PyPI (https://pypi.org/project/openvino).
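Issue 21670 in the table above can be illustrated with a generic symmetric linear quantizer. The sketch below is not the GNA Plugin's actual quantization scheme; `quantize_symmetric` and the sample weights are hypothetical and serve only to show why a tensor with one cluster of weights near zero and another at large magnitudes loses the small weights at 8 bits but keeps them at 16 bits.

```python
def quantize_symmetric(weights, bits):
    """Quantize with a single per-tensor scale to signed integers, then dequantize."""
    levels = 2 ** (bits - 1) - 1                      # 127 for 8-bit, 32767 for 16-bit
    scale = max(abs(w) for w in weights) / levels     # the largest weight sets the step size
    return [round(w / scale) * scale for w in weights]


# Bimodal distribution: one cluster near zero, one cluster of large magnitudes.
weights = [0.001, -0.002, 0.003, 0.9, -1.0]

restored_8 = quantize_symmetric(weights, bits=8)      # step is about 0.0079
restored_16 = quantize_symmetric(weights, bits=16)    # step is about 0.00003

# The near-zero cluster is smaller than half an 8-bit step, so it rounds to zero.
print(restored_8[:3])
print(restored_16[:3])
```

Under these assumptions, the large weights dictate the step size, and any weight below half a step collapses to zero. This is consistent with the 16-bit workaround listed for the issue.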

Included in This Release

The Intel® Distribution of OpenVINO™ toolkit is available in these versions:

  • OpenVINO™ toolkit for Windows*
  • OpenVINO™ toolkit for Linux*
  • OpenVINO™ toolkit for macOS*

Component | Description | License | Location | Windows | Linux | macOS
Deep Learning Model Optimizer | Model optimization tool for your trained models | Apache 2.0 | <install_root>/deployment_tools/model_optimizer/* | YES | YES | YES
Deep Learning Inference Engine | Unified API to integrate the inference with application logic | EULA | <install_root>/deployment_tools/inference_engine/* | YES | YES | YES
Inference Engine Headers | | Apache 2.0 | <install_root>/deployment_tools/inference_engine/include/* | YES | YES | YES
OpenCV* library | OpenCV Community version compiled for Intel® hardware | Apache 2.0 | <install_root>/opencv/* | YES | YES | YES
Intel® Media SDK libraries (open source version) | Eases the integration between the OpenVINO™ toolkit and the Intel® Media SDK. | MIT | <install_root>/../mediasdk/* | NO | YES | NO
OpenVINO™ toolkit documentation | Developer guides and other documentation | | Available from the OpenVINO™ toolkit product site, not part of the installer packages. | NO | NO | NO
Open Model Zoo | Documentation for models from the Intel® Open Model Zoo. Use the Model Downloader to download models in a binary format. | Apache 2.0 | <install_root>/deployment_tools/open_model_zoo/* | YES | YES | YES
Inference Engine Samples | Samples that illustrate Inference Engine API usage and demos that demonstrate how you can use features of Intel® Distribution of OpenVINO™ toolkit in your application | Apache 2.0 | <install_root>/deployment_tools/inference_engine/samples/* | YES | YES | YES
Deep Learning Workbench | Enables you to run deep learning models through the OpenVINO™ Model Optimizer, convert models into INT8, fine-tune them, run inference, and measure accuracy. | EULA | Starting with the Intel® Distribution of OpenVINO™ toolkit 2021.3 release, DL Workbench is available only as a prebuilt Docker image. A reference to DL Workbench is kept in the OpenVINO installation, but it now pulls a prebuilt image from DockerHub instead of building it from the package. | YES | YES | YES
Post-Training Optimization Toolkit | Designed to convert a model into a more hardware-friendly representation by applying specific methods that do not require retraining, for example, post-training quantization. | EULA | <install_root>/deployment_tools/tools/post_training_optimization_toolkit/* | YES | YES | YES
Speech Libraries and End-to-End Speech Demos | | GNA Software License Agreement | <install_root>/data_processing/audio/speech_recognition/* | YES | YES | NO
DL Streamer | | EULA | <install_root>/data_processing/dl_streamer/* | NO | YES | NO

Where to Download this Release

System Requirements

Disclaimer: Certain hardware (including but not limited to GPU and GNA) requires installation of specific drivers to work correctly. Drivers might require updates to your operating system, including the Linux kernel; refer to their documentation for details. Operating system updates should be handled by the user and are not part of the OpenVINO installation.

Intel® CPU Processors

Hardware:

  • Intel Atom® processor with Intel® SSE4.2 support
  • Intel® Pentium® processor N4200/5, N3350/5, N3450/5 with Intel® HD Graphics
  • 6th - 11th generation Intel® Core™ processors
  • Intel® Xeon® Scalable Processors (formerly Skylake)
  • 2nd Generation Intel® Xeon® Scalable Processors (formerly Skylake and Cascade Lake)
  • 3rd Generation Intel® Xeon® Scalable Processors (formerly Cooper Lake and Ice Lake)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Ubuntu* 20.04 long-term support (LTS), 64-bit - preview support
  • Windows* 10, 64-bit
  • macOS* 10.15, 64-bit
  • CentOS* 7, 64-bit
  • For deployment scenarios on Red Hat* Enterprise Linux* 8.2 (64-bit), you can use the Intel® Distribution of OpenVINO™ toolkit run-time package, which includes the Inference Engine core libraries, nGraph, OpenCV, Python bindings, and the CPU and GPU plugins. The package is available as:

Intel® Processor Graphics

Hardware:

  • Intel® HD Graphics
  • Intel® UHD Graphics
  • Intel® Iris® Xe Graphics
  • Intel® Iris® Xe Max Graphics 
  • Intel® Iris® Pro Graphics

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit
  • Yocto* 3.0, 64-bit
  • For deployment scenarios on Red Hat* Enterprise Linux* 8.2 (64-bit), you can use the Intel® Distribution of OpenVINO™ toolkit run-time package, which includes the Inference Engine core libraries, nGraph, OpenCV, Python bindings, and the CPU and GPU plugins. The package is available as:

Note This installation requires drivers that are not included in the Intel® Distribution of OpenVINO™ toolkit package.

Note A chipset that supports processor graphics is required for Intel® Xeon® processors. Processor graphics are not included in all processors. See Product Specifications for information about your processor.

Intel® Gaussian & Neural Accelerator (Intel® GNA)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit

Intel® VPU Processors

Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPU)

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit (Linux Kernel 5.2 and below)
  • Windows* 10, 64-bit
  • CentOS* 7.6, 64-bit

Intel® Movidius™ Neural Compute Stick and Intel® Neural Compute Stick 2

Operating Systems:

  • Ubuntu* 18.04 long-term support (LTS), 64-bit
  • CentOS* 7.6, 64-bit
  • Windows* 10, 64-bit
  • Raspbian* (target only)

AI Edge Computing Board with Intel® Movidius™ Myriad™ X C0 VPU, MYDX x 1

Operating Systems:

  • Windows* 10, 64-bit

Components Used in Validation

Operating systems used in validation:

DL frameworks used for validation:

  • TensorFlow 1.15.2, 2.2.0 (limited support according to product features)
  • MXNet 1.5.1

Note: The version of CMake specified above is for building OpenVINO from source. Building samples and demos from the Intel® Distribution of OpenVINO™ toolkit package requires CMake* 3.10 or higher (except on Windows, where CMake 3.14 is required as the first version that supports Visual Studio 2019).

Legal Information

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Arria, Core, Movidius, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

*Other names and brands may be claimed as the property of others.

Copyright © 2021, Intel Corporation. All rights reserved.

For more complete information about compiler optimizations, see our Optimization Notice.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.