Run OpenVINO™ Benchmarking Tool

This tutorial describes how to run the benchmark application on an 11th Generation Intel® Core™ processor with an integrated GPU. The benchmark application estimates deep learning Inference Engine performance. For the scope of this exercise, only asynchronous mode is used to measure performance and latency.

Start Docker Container

Run the command below to start the Docker container as root:
docker run -it --env="USER=root" --env="DISPLAY" --env="QT_X11_NO_MITSHM=1" --privileged --volume /dev:/dev --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" amr-ubuntu2004-full-flavour-sdk:<TAG>
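The --privileged flag and the /dev volume mount expose the host GPU to the container. As a quick sanity check (assuming a standard Intel graphics driver setup on the host), confirm that the GPU render device is visible inside the container:
ls /dev/dri
On a typical system this lists device nodes such as card0 and renderD128; if the directory is empty, the integrated GPU is not available to the container.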

Set Environment Variables

The environment variables must be set before you can compile and run OpenVINO™ applications. Run one of the following scripts:
source /opt/intel/openvino/bin/setupvars.sh
-- or --
source <OPENVINO_INSTALL_DIR>/bin/setupvars.sh
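If the script runs successfully, it prints a confirmation message similar to the following (the exact wording may vary between releases):
[setupvars.sh] OpenVINO environment initialized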

Build Benchmark Application

  1. Change directory and build the benchmark application using the cmake build script with the following commands:
    cd /opt/intel/openvino/inference_engine/samples/cpp
    ./build_samples.sh
  2. Once the build is successful, access the benchmark application in the following directory:
    cd /root/inference_engine_cpp_samples_build/intel64/Release
    -- or --
    cd <INSTALL_DIR>/inference_engine_cpp_samples_build/intel64/Release
    The benchmark_app application is available inside the Release folder.

Download Model File

To run the benchmark application, download the desired pre-trained model from the Open Model Zoo using the Model Downloader tool.
  1. Enter the command:
    cd /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader
    The syntax of the model download script is:
    python3 downloader.py --name <model_name>   # To download a single model
  2. This tutorial uses the mobilenet-ssd model in FP16 data format. Download this model using the command:
    # Export proxy settings if needed.
    python3 downloader.py --name mobilenet-ssd
  3. The downloader.py script downloads the mobilenet-ssd model, and the converter.py script converts the model to an Inference Engine format that can be used by OpenVINO™:
    python3 converter.py --name mobilenet-ssd --mo /opt/intel/openvino/deployment_tools/model_optimizer/mo.py
    -- or --
    sudo python3 converter.py --name mobilenet-ssd --mo /opt/intel/openvino/deployment_tools/model_optimizer/mo.py
    [ SUCCESS ] Generated IR version 10 model.
    [ SUCCESS ] XML file: /opt/intel/openvino_2021.2.200/deployment_tools/open_model_zoo/tools/downloader/public/mobilenet-ssd/FP32/mobilenet-ssd.xml
    [ SUCCESS ] BIN file: /opt/intel/openvino_2021.2.200/deployment_tools/open_model_zoo/tools/downloader/public/mobilenet-ssd/FP32/mobilenet-ssd.bin
    [ SUCCESS ] Total execution time: 6.30 seconds.
    [ SUCCESS ] Memory consumed: 380 MB.
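The FP16 IR files used later in this tutorial should now be present under the downloader's public directory (the exact install path may differ on your system):
ls /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/public/mobilenet-ssd/FP16/
Expect to see mobilenet-ssd.xml and mobilenet-ssd.bin.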

Input File

Select an image file or a sample video file to provide as input to the benchmark application, which is run from the following directory:
cd /root/inference_engine_cpp_samples_build/intel64/Release

Application Syntax and Options

The benchmark application syntax is as follows:
./benchmark_app [OPTION]
In this tutorial, we recommend that you use the following options:
./benchmark_app -m <model> -i <input> -d <device> -nireq <num_reqs> -nthreads <num_threads> -b <batch>
where:
<model> ............ The complete path to the model .xml file
<input> ............ The path to the folder containing the image or sample video file
<device> ........... The target device type, for example CPU or GPU
<num_reqs> ......... The number of parallel inference requests
<num_threads> ...... The number of CPU threads to use for inference (throughput mode)
<batch> ............ The batch size
For complete details on the available options, run the following command:
./benchmark_app -h
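For example, to run the same model on the CPU for comparison (the paths shown here are placeholders for the IR file and input folder produced in the earlier steps):
./benchmark_app -d CPU -i <path_to_input> -m <path_to>/mobilenet-ssd/FP16/mobilenet-ssd.xml -nireq 8 -nthreads 8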

Run the Application

The benchmark application is executed as shown below. This tutorial uses the following settings:
  • The benchmark application is run on the mobilenet-ssd model in FP16 data format.
  • The number of parallel inference requests is set to 8.
  • The number of CPU threads to use for inference is set to 8.
  • The device type is GPU.
./benchmark_app -d GPU -i ~/<dir>/input/ -m ~/<dir>/mobilenet-ssd/FP16/mobilenet-ssd.xml -nireq 8 -nthreads 8

# Copy the plates_720.mp4 video into the Docker container, run:
curl -o /data_samples/media_samples/plates_720.mp4 --proxy "http://proxy-dmz.intel.com:911" http://glaic3n002.gl.intel.com/plates_720.mp4
plates_720.mp4 downloaded with success.

./benchmark_app -d GPU -i /data_samples/media_samples/plates_720.mp4 -m /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/public/mobilenet-ssd/FP16/mobilenet-ssd.xml -nireq 8 -nthreads 8
Expected output:
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ] /data_samples/media_samples/plates_720.mp4
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version ............ 2.1
         Build .................. 2021.2.0-1877-176bdf51370-releases/2021/2
         Description ....... API
[ INFO ] Device info:
         GPU
         clDNNPlugin version ......... 2.1
         Build ........... 2021.2.0-1877-176bdf51370-releases/2021/2
[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 24.97 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 11719.39 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'data' precision U8, dimensions (NCHW): 1 3 300 300
[ WARNING ] No supported image inputs found! Please check your file extensions: bmp, dib, jpeg, jpg, jpe, jp2, png, pbm, pgm, ppm, sr, ras, tiff, tif
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 8 inference requests using 2 streams for GPU, limits: 60000 ms duration)
[ INFO ] First inference took 8.39 ms
[Step 11/11] Dumping statistics report
Count:      10840 iterations
Duration:   60077.15 ms
Latency:    44.37 ms
Throughput: 180.43 FPS
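To keep the raw log for later comparison, you can redirect the application output to a file and pull out the summary lines with standard shell tools (benchmark_gpu.log is only an example file name):
./benchmark_app -d GPU -i /data_samples/media_samples/plates_720.mp4 -m /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/public/mobilenet-ssd/FP16/mobilenet-ssd.xml -nireq 8 -nthreads 8 | tee benchmark_gpu.log
grep -E "Count|Duration|Latency|Throughput" benchmark_gpu.log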

Benchmark Report

Sample execution results using an 11th Gen Intel® Core™ i7-1185G7E @ 2.80 GHz:

Read network time (ms) ........ 24.97
Load network time (ms) ........ 11719.39
First inference time (ms) ..... 8.39
Total execution time (ms) ..... 60077.15
Total number of iterations .... 10840
Latency (ms) .................. 44.37
Throughput (FPS) .............. 180.43
Note:
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. No product or component can be absolutely secure. Performance varies by use, configuration and other factors. Learn more at Intel Performance Index.

Summary and Next Steps

In this tutorial, you learned how to run the OpenVINO™ benchmarking tool.

Product and Performance Information

1 Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.