Developer Guide

Invoke the Profiler Runtime Wrapper to Obtain Profiling Data

After compiling your DPC++ program using the Intel® oneAPI DPC++/C++ Compiler, you can profile your FPGA design using the Profiler Runtime Wrapper. The Profiler Runtime Wrapper calls your executable and collects profile information at a given sample rate. The performance counter data is saved in a profile.mon monitor description file, which the Profiler Runtime Wrapper post-processes into a readable profile.json file. Use profile.json rather than profile.mon for further data processing. Both files are available after host execution completes.
To invoke the Profiler Runtime Wrapper, execute the following command:
aocl profile [options] /path/to/executable [executable options]
where:
  • [options] are any additional flags you want to pass to the wrapper. Refer to aocl profile --help for a list of options and their uses.
  • /path/to/executable is the path to the executable generated by the compiler.
  • [executable options] are any options or arguments that need to be passed along to the executable.
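As an illustration of the command syntax above, the following shell sketch fills in the three parts of the command line. The executable name ./vector_add.fpga and its argument 1024 are hypothetical, and the run helper is not part of the toolchain: it only prints the command unless DRY_RUN is set to 0, so the sketch can be tried on a machine without the aocl utility.

```shell
#!/bin/sh
# Sketch: assemble and (optionally) execute an `aocl profile` command line.
# DRY_RUN=1 (the default here) prints the command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

WRAPPER_OPTS=""          # [options]: wrapper flags, empty in this sketch
EXE=./vector_add.fpga    # /path/to/executable: hypothetical name
EXE_ARGS="1024"          # [executable options]: hypothetical argument

# WRAPPER_OPTS and EXE_ARGS are intentionally unquoted so that several
# space-separated options split into separate words.
run aocl profile $WRAPPER_OPTS "$EXE" $EXE_ARGS
```

With the default DRY_RUN=1, the script prints the command it would execute. Set DRY_RUN=0 to invoke the wrapper for real.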
Because network disk accesses are slow, running the host application from a networked directory might introduce delays between kernel executions, which can increase the overall execution time of the host application. Delays might also occur during kernel executions while the runtime stores profile output data to disk.

Split the Execution and Data Post-Processing

By default, the Profiler Runtime Wrapper automatically runs a post-processing step on your profile.mon monitor file to produce a readable profile.json file. In some situations, the post-processing step may take longer than expected. In that case, you can run the execution and the data post-processing as two separate manual steps by using the --no-json and --no-run <path to profile.mon file> Profiler Runtime Wrapper options.
  • The --no-json flag runs your executable and produces a profile.mon monitor file without post-processing it.
  • The --no-run <path to profile.mon file> flag does not invoke your executable. It only runs the post-processing step on the supplied profile.mon file.
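The two-step workflow described above might look like the following shell sketch. The executable name is hypothetical, and the sketch assumes that profile.mon is written to the current working directory, which the text above does not state. The run helper is not part of the toolchain: it prints each command unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: split profiling into a run step and a post-processing step.
# DRY_RUN=1 (the default here) prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

EXE=./vector_add.fpga    # hypothetical executable name
MON=./profile.mon        # assumption: monitor file lands in the current directory

# Step 1: run the executable and collect profile.mon, skipping post-processing.
run aocl profile --no-json "$EXE"

# Step 2 (can run later): post-process the existing profile.mon into profile.json.
run aocl profile --no-run "$MON"
```

Separating the steps this way lets the time-critical host run finish first and defers the potentially slow conversion to a more convenient moment.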

Temporal Performance Collection

During the run of your host application, the Profiler collects performance counter data at a given sampling period of n kernel clock cycles. After every n cycles, the Profiler collects the performance counter data and writes it to the profile.mon monitor file.
  • You can control the rate at which the Profiler counters are sampled by setting the Profiler Runtime Wrapper's -period flag. The specified period is the minimum number of kernel pipeline clock cycles between profiling samples. If you do not set a period, the default behavior is to profile as often as possible.
    For particularly large or long-running designs, the amount of data generated by the default temporal period might result in very large profile.mon and profile.json files. To reduce this file size, increase the sampling period or turn off temporal profiling.
  • To turn off temporal profiling and instead collect performance data only once a kernel has finished executing, you can set the Profiler Runtime Wrapper's -no-temporal flag.
    If you collect the performance data only at the end of execution, the data is an average representation of the kernel's overall execution.
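To make the effect of the sampling period concrete, the shell sketch below estimates an upper bound on the sample rate and then shows both profiling modes. The 300 MHz kernel clock, the period of 10000 cycles, and the executable name are hypothetical values. Passing the period as a separate argument after -period is also an assumption, so check the wrapper's help output for the exact syntax. The run helper is not part of the toolchain: it prints each command unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: pick a sampling period and estimate the resulting sample rate.
# DRY_RUN=1 (the default here) prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upper bound on samples per second. The period is the MINIMUM number of
# kernel clock cycles between samples, so the real rate can only be lower.
max_samples_per_sec() {
    clock_hz=$1
    period_cycles=$2
    echo $(( clock_hz / period_cycles ))
}

EXE=./vector_add.fpga    # hypothetical executable name
CLOCK_HZ=300000000       # hypothetical 300 MHz kernel clock
PERIOD=10000             # hypothetical sampling period in clock cycles

echo "at most $(max_samples_per_sec "$CLOCK_HZ" "$PERIOD") samples per second"

# Temporal profiling with an explicit period (argument syntax is an assumption).
run aocl profile -period "$PERIOD" "$EXE"

# No temporal profiling: collect data once, after each kernel finishes.
run aocl profile -no-temporal "$EXE"
```

Doubling the period halves this upper bound, which is the lever for keeping profile.mon and profile.json at a manageable size.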
