Run CPU / Memory Roofline Insights Perspective from Command Line

To plot a Roofline chart, the Intel® Advisor does the following:
  1. Collects OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
  2. Measures the hardware limitations and collects floating-point and integer operations data using the Characterization analysis with GPU profiling.
    Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH.
    Intel Advisor automatically determines the data type of the collected operations using the dst register.
For convenience, Intel Advisor has the shortcut --collect=roofline command line action, which you can use to run both the Survey and Characterization analyses with a single command. This shortcut command is recommended to run the CPU / Memory Roofline Insights perspective, but it does not support MPI applications. To analyze an MPI application, run the --collect=survey and --collect=tripcounts commands one by one.

Prerequisites

Set Intel Advisor environment variables with an automated script to enable the advisor command line interface (CLI).
In the commands below, the options in square brackets ([--<option>]) are optional and recommended when you want to change what data is collected.
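Before running the commands below, it can help to confirm that the environment script actually put the CLI on your PATH. The following is a minimal sketch, not part of Intel Advisor: the helper name have_cli is hypothetical, and the script location in the comment is an assumption about a default oneAPI installation, so substitute the path from your own setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_cli() {
    command -v "$1" >/dev/null 2>&1
}

# Assumed location of the environment script for a default oneAPI install.
# Adjust it for your system, then source it in the current shell:
#   . /opt/intel/oneapi/setvars.sh

if have_cli advisor; then
    echo "advisor CLI found: $(command -v advisor)"
else
    echo "advisor CLI not found; source the environment script first"
fi
```

The check only looks up the command name; it does not verify the Intel Advisor version or any other environment variable.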

Plot a CPU Roofline Chart

  1. Run the Roofline analysis for CPU with one of the following methods:
    • Using the shortcut command line action:
      advisor --collect=roofline --project-dir=<project-dir> [--stacks] [--enable-cache-simulation] -- <target-application> [<target-options>]
    • Using two separate commands:
      advisor --collect=survey --project-dir=<project-dir> -- <target-application> [<target-options>]
      advisor --collect=tripcounts --flop --project-dir=<project-dir> [--stacks] [--enable-cache-simulation] -- <target-application> [<target-options>]
      Use this method to analyze an MPI application. See Analyze MPI Workloads for details.
      where:
      • --stacks is an option to enable advanced collection of call stack data. Use this option to generate a CPU Roofline chart with call stacks, which extends the basic model with total data capability. The total data includes data from the loop/function itself and its inner loops/functions.
      • --enable-cache-simulation is an option to model multiple levels of cache and evaluate the data transfers between the different memory layers available on your system. Use this option to generate a Memory-Level CPU Roofline chart.
      Without these two options, Intel Advisor generates a basic CPU Roofline chart based on the Cache-Aware Roofline Model (CARM).
  2. Optional: Check memory access patterns to get detailed information about memory usage. Run the Memory Access Patterns analysis for the marked loops:
    advisor --collect=map --project-dir=<project-dir> [--enable-cache-simulation] --select=<criteria> -- <target-application> [<target-options>]
    where:
    • --enable-cache-simulation is an option to model accurate memory footprints, miss information, and cache line utilization. Use this option for the Memory Access Patterns analysis if you used it for the Roofline analysis.
    • --select=<string> is an option to select loops for the analysis by loop IDs, source locations, or criteria such as scalar, has-issue, or markup=<markup-mode>. For example, use --select=has-issue to analyze loops that have the Possible Inefficient Memory Access Pattern issue.
      For more information about markup options, see Loop Markup to Minimize Overhead.
    This analysis does not add more information to the CPU Roofline chart. The results are added to the Refinement report, which you can view from the GUI or from the CLI. Use the report to understand the Memory-Level Roofline chart better and to get more detailed optimization recommendations.
Example
Collect data for the Memory-Level CPU Roofline chart with call stacks:
advisor --collect=roofline --project-dir=./advi --stacks --enable-cache-simulation -- myApplication
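For cases where the shortcut action is not suitable, the same collection can be written as separate steps. The sketch below is a dry run that only prints the commands: the helper name print_roofline_steps and the PROJECT_DIR and APP values are placeholders chosen for illustration, and the has-issue selection is just one of the criteria described above.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# PROJECT_DIR and APP are placeholder values, not part of Intel Advisor.
PROJECT_DIR=./advi
APP=myApplication

# Hypothetical helper: print the collection commands in the order they run.
# Usage: print_roofline_steps <project-dir> <target-application>
print_roofline_steps() {
    # 1. Survey analysis.
    echo "advisor --collect=survey --project-dir=$1 -- $2"
    # 2. Characterization analysis with call stacks and cache simulation.
    echo "advisor --collect=tripcounts --flop --project-dir=$1 --stacks --enable-cache-simulation -- $2"
    # 3. Optional Memory Access Patterns analysis for loops with issues.
    echo "advisor --collect=map --project-dir=$1 --enable-cache-simulation --select=has-issue -- $2"
}

print_roofline_steps "$PROJECT_DIR" "$APP"
```

Review the printed commands first. Once they look right for your project, you could run them one by one or pipe the output to sh.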

View the Results

Intel Advisor provides several ways to work with the CPU / Memory Roofline Insights results.
View Results in GUI
When you run the Intel Advisor CLI, a project is created automatically in the directory specified with --project-dir. All the collected results and analysis configurations are stored in the .advixeproj project, which you can view in the Intel Advisor GUI.
To open the project in GUI, run the following command:
advisor-gui <project-dir>
If the report does not open, click Show Result on the Welcome pane.
You will see the CPU Roofline report that includes:
  • Roofline chart that plots an application's achieved performance and arithmetic intensity against the CPU maximum achievable performance
  • Additional information about your application in the Advanced View pane under the chart, including source code, detailed code analytics for trip counts and FLOP/INTOP data, optimization recommendations, and compiler diagnostics
    Select a dot on the Roofline chart to see details for the selected loop in all tabs of the Advanced View pane.
(Figure: CPU Roofline report)
View an Interactive HTML Report
To generate an interactive HTML report for the CPU Roofline chart from CLI, run the following command:
advisor --report=roofline --project-dir=<project-dir> --report-output=<path> [--with-stack] [--data-type=<type>] [--memory-level=<string>]
where:
  • --report-output=<path> is a path and a name for an HTML file to save the report to. For example, /home/roofline.html. This option is required to generate an HTML report.
  • --with-stack is an option to enable call stack data in the HTML report. Use it if you generated the CPU Roofline results with call stack data using the --stacks option.
  • --data-type=<type> is a specific type of data to show in the HTML report. Available types are float (default), int, and mixed. You cannot change the data type after the report is generated.
  • --memory-level=<string> is the memory level or levels to show in the HTML report by default. Available memory levels are L1 (default), L2, L3, and DRAM. You can combine several memory levels with an underscore (for example, L1_L2).
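The option rules above can be sketched as a small command builder. This is an illustrative dry run, not an Intel Advisor utility: the function name roofline_report_cmd and its argument order are hypothetical, and it only prints the command after checking the data type and memory levels against the values listed above.

```shell
#!/bin/sh
# Hypothetical helper: print an `advisor --report=roofline` command.
# Usage: roofline_report_cmd <project-dir> <html-path> <data-type> [<level>...]
roofline_report_cmd() {
    project_dir=$1; html_path=$2; data_type=$3
    shift 3

    # Accept only the data types listed in the option description.
    case $data_type in
        float|int|mixed) ;;
        *) echo "unsupported data type: $data_type" >&2; return 1 ;;
    esac

    # Join the memory levels with underscores, for example L1 L2 -> L1_L2.
    levels=
    for level in "$@"; do
        case $level in
            L1|L2|L3|DRAM) ;;
            *) echo "unsupported memory level: $level" >&2; return 1 ;;
        esac
        levels=${levels:+${levels}_}$level
    done
    # Fall back to the documented default when no level is given.
    [ -n "$levels" ] || levels=L1

    echo "advisor --report=roofline --project-dir=$project_dir --report-output=$html_path --data-type=$data_type --memory-level=$levels"
}

roofline_report_cmd ./advi ./roofline.html float L1 L2
```

Because the data type cannot be changed after the report is generated, validating it before the command runs avoids regenerating the report for a simple typo.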
When you open the report, you see the CPU Roofline chart with the selected configuration. In this report, you can:
  • Expand the
    Performance Metrics Summary
    drop-down to view the summary performance characteristics for your application.
  • Double-click a dot on the chart to see roof rulers that point to the exact roofs that bound the dot.
  • Hover over a dot to see a detailed tooltip with performance metrics.
If you have a Memory-level Roofline report, you can also:
  • Use the filter drop-down list on the chart to select the memory levels for which dots are shown.
  • Double-click a dot on the chart to expand it for other memory levels and see roof rulers.
(Figure: CPU Roofline HTML report)
Save a Read-only Snapshot
A snapshot is a read-only copy of a project result, which you can view at any time using the Intel Advisor GUI. To save an active project result as a read-only snapshot:
advisor --snapshot --project-dir=<project-dir> [--cache-sources] [--cache-binaries] -- <snapshot-path>
where:
  • --cache-sources is an option to add application source code to the snapshot.
  • --cache-binaries is an option to add application binaries to the snapshot.
  • <snapshot-path> is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, a snapshot is saved in the tmp directory as new_snapshot.advixeexpz. If you omit the path, the snapshot is saved to the current directory as snapshotXXX.advixeexpz.
To open the result snapshot in the Intel Advisor GUI, you can run the following command:
advisor-gui <snapshot-path>
You can visually compare the saved snapshot against the current active result or other snapshot results.
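As a concrete illustration of the snapshot naming described above, the following dry-run sketch prints a snapshot command and the archive name it is expected to produce. The helper names snapshot_cmd and snapshot_file are hypothetical, and the paths are example values only.

```shell
#!/bin/sh
# Hypothetical helper: print a snapshot command that also packs sources and binaries.
# Usage: snapshot_cmd <project-dir> <snapshot-path>
snapshot_cmd() {
    echo "advisor --snapshot --project-dir=$1 --cache-sources --cache-binaries -- $2"
}

# Hypothetical helper: the archive name expected for a given snapshot path,
# following the .advixeexpz naming described above.
snapshot_file() {
    echo "$1.advixeexpz"
}

snapshot_cmd ./advi /tmp/new_snapshot
echo "expected archive: $(snapshot_file /tmp/new_snapshot)"
```

Including sources and binaries makes the snapshot larger but lets someone on another machine see the source view without access to the original build tree.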

Next Steps

These sections are GUI-focused, but you can still use them to understand the results. For details about the metrics reported, see CPU Metrics.
