Developer Guide


Set Up the Intercept Layer for OpenCL* Applications

The Intercept Layer for OpenCL* Applications is available on GitHub*.
To set up the Intercept Layer for OpenCL Applications, perform the following steps:
  1. Download Intercept Layer for OpenCL Applications version 2.2.1 or later from GitHub*.
  2. Build the Intercept Layer according to the instructions provided in How to Build the Intercept Layer for OpenCL* Applications.
  3. Ensure that you have set ENABLE_CLILOADER=1 when running the cmake command. For example, run:
    cmake -DENABLE_CLILOADER=1 ..
  4. Run the build command in the build directory. This step builds the cliloader loader utility. The cliloader executable should now exist in the following directory:
    <path to opencl-intercept-layer-master download>/<build dir>/cliloader/
  5. Add the directory to your PATH environment variable if you want to run multiple designs using cliloader.
    You can now pass your executables to cliloader to run them with the intercept layer. For details about the cliloader loader utility, see cliloader: A Intercept Layer for OpenCL* Applications Loader.
  6. Set cliloader and other Intercept Layer options.
    If you run multiple designs with the same options, set up a configuration file in your home directory. You can also set each option as an environment variable by adding a prefix to the option name. For a list of options, see How to Use the Intercept Layer for OpenCL Applications.
    The options to set fall into three groups: the location of the OpenCL library file from the original oneAPI build, which the intercept layer must know; options that print runtime timeline information in the output of the executable run; and options that set up the Chrome tracer output and ensure that it has Queued, Submitted, and Execution stages.
These instructions set up the cliloader executable, which gives you control over when the layer is used. If you prefer a local installation (for a single design) or a global installation (always ON for all designs), follow the instructions at How to Install the Intercept Layer for OpenCL Applications.
When you run the host executable with the cliloader <executable> [executable args] command, the output contains lines as shown in the following example:
Device Timeline for clEnqueueWriteBuffer (enqueue 1) = 63267241140401 ns (queued), 63267241149579 ns (submit), 63267241194205 ns (start), 63267242905519 ns (end)
These lines give timeline information about a variety of oneAPI runtime calls. After the host executable finishes running, the output also includes a summary of the performance information for the run, and the collected data is placed in a dump directory, which is in the home directory by default. You can change its location with the DumpDir=<directory where you want the output files> cliloader option. The dump directory contains a JSON trace file. You can load this JSON file in the Google* Chrome trace event profiling tool to visualize the timeline data collected by the run.
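The four timestamps in a Device Timeline line can be turned into per-stage durations. The following is a minimal Python sketch, assuming every line follows exactly the format of the example above; the pattern and the stage_durations helper are illustrative names and are not part of the Intercept Layer.

```python
import re

# Assumed line format, taken from the single example line in this guide.
TIMELINE_RE = re.compile(
    r"Device Timeline for (?P<call>.+?) \(enqueue (?P<enqueue>\d+)\) = "
    r"(?P<queued>\d+) ns \(queued\), (?P<submit>\d+) ns \(submit\), "
    r"(?P<start>\d+) ns \(start\), (?P<end>\d+) ns \(end\)"
)


def stage_durations(line):
    """Return the time spent in each stage, in ns, or None if the line does not match."""
    match = TIMELINE_RE.search(line)
    if match is None:
        return None
    queued, submit, start, end = (
        int(match.group(name)) for name in ("queued", "submit", "start", "end")
    )
    return {
        "call": match.group("call"),
        "enqueue": int(match.group("enqueue")),
        "queued_ns": submit - queued,      # waiting in the queue
        "submitted_ns": start - submit,    # submitted, not yet executing
        "execution_ns": end - start,       # executing on the device
    }


sample = (
    "Device Timeline for clEnqueueWriteBuffer (enqueue 1) = "
    "63267241140401 ns (queued), 63267241149579 ns (submit), "
    "63267241194205 ns (start), 63267242905519 ns (end)"
)
durations = stage_durations(sample)
print(durations)
```

For the example line, this reports 9178 ns queued, 44626 ns submitted, and 1711314 ns executing.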
The following is a sample visualization of timeline data:
OpenCL Intercept Layer Full Example Trace
This visualization shows the calls executed over time. The X-axis is time, with the scale shown near the top of the page. The Y-axis lists the calls, grouped in several ways.
The left side (Y-axis) has two different types of numbers:
  • Numbers that contain a decimal point.
    • The part of the number before the decimal point orders the calls approximately by start time.
    • The part of the number after the decimal point represents the queue number the call was made in.
  • Numbers that do not contain a decimal point. These numbers represent the ID of the operating system thread on which the call ran.
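The two kinds of row labels can be told apart mechanically. The following is a small Python sketch of that rule; the describe_row_label helper and the sample labels are hypothetical and only illustrate the description above.

```python
def describe_row_label(label):
    """Interpret a Y-axis row label according to the two cases described above."""
    if "." in label:
        # Before the point: approximate ordering by start time.
        # After the point: the queue number the call was made in.
        order, queue = label.split(".", 1)
        return {"kind": "queue", "order": int(order), "queue": int(queue)}
    # No decimal point: the label is an operating system thread ID.
    return {"kind": "thread", "thread_id": int(label)}


# Hypothetical labels, not taken from a real trace.
queue_row = describe_row_label("2.1")
thread_row = describe_row_label("14532")
print(queue_row)
print(thread_row)
```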
The colors in the trace represent different stages of execution:
  • Blue during the queued stage.
  • Yellow during the submitted stage.
  • Orange for the execution stage.
Look for gaps between consecutive execution stages and kernel runs to find possible areas for optimization.
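The same gaps can be found numerically once you have the start and end times of the execution stages. The following Python sketch assumes you have already extracted them as (start, end) pairs in nanoseconds; the execution_gaps helper and the sample intervals are hypothetical.

```python
def execution_gaps(intervals, min_gap_ns=0):
    """Return (idle_from, idle_until, gap_ns) for idle periods between execution stages."""
    ordered = sorted(intervals)
    gaps = []
    if not ordered:
        return gaps
    busy_until = ordered[0][1]
    for start, end in ordered[1:]:
        gap = start - busy_until
        if gap > min_gap_ns:
            gaps.append((busy_until, start, gap))
        # Overlapping executions extend the busy period instead of creating a gap.
        busy_until = max(busy_until, end)
    return gaps


# Hypothetical (start_ns, end_ns) execution intervals, not from a real run.
runs = [(0, 100), (150, 300), (280, 400), (900, 1000)]
gaps = execution_gaps(runs)
print(gaps)
```

With these sample intervals, the sketch reports a 50 ns gap and a 500 ns gap. The larger gap is the more promising place to look, for example for a transfer that could overlap with kernel execution.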
For an example use of Intercept Layer for OpenCL Applications, see Applying Double-Buffering Using the Intercept Layer for OpenCL* Applications.

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804