User Guide

Loop Markup to Minimize Analysis Overhead

Issue

Running your target application with Intel® Advisor can take substantially longer than running it without Intel® Advisor. The overhead added to your application execution time depends on the accuracy level and the analyses you choose for a perspective.
For example:
Analysis                        Runtime overhead (target application runtime with Intel® Advisor compared to runtime without Intel® Advisor)
Survey                          1.1x longer
Characterization                2 - 55x longer
Dependencies                    5 - 100x longer
MAP (Memory Access Patterns)    5 - 20x longer
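To get a feel for what these factors mean in practice, the following sketch multiplies a baseline runtime by the overhead ranges listed above. The baseline of 120 seconds and the helper name estimate_range are assumptions made for illustration only; substitute your own measurement.

```shell
#!/bin/sh
# Hypothetical baseline: runtime of the target application without Intel Advisor.
baseline_seconds=120

# estimate_range <baseline_seconds> <min_factor> <max_factor>
# Prints "<min> <max>" in whole seconds. awk is used because 1.1 is not an integer.
estimate_range() {
    awk -v b="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { printf "%.0f %.0f\n", b * lo, b * hi }'
}

# Factors are taken from the overhead figures above.
echo "Survey:           $(estimate_range "$baseline_seconds" 1.1 1.1) s"
echo "Characterization: $(estimate_range "$baseline_seconds" 2 55) s"
echo "Dependencies:     $(estimate_range "$baseline_seconds" 5 100) s"
echo "MAP:              $(estimate_range "$baseline_seconds" 5 20) s"
```

With this assumed baseline, a Dependencies analysis could take anywhere from 600 to 12000 seconds, which is why limiting the analysis to a few loops matters.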

Solutions

Use the following techniques to skip uninteresting loops and analyze only interesting loops.

Select Loops by ID

Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
  • You want to perform a deeper analysis on only a few loops.
  • CLI environment: You cannot identify source file/line numbers, such as when you are analyzing a target application for which you do not have access to source code.
Prerequisites:
  1. Run a Survey analysis.
  2. advisor CLI environment: Identify the loop IDs for the loops of interest.
    advisor --report=survey --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    In the report, the first column contains the loop IDs.
Intel® Advisor reports tend to be very wide. Do one of the following to generate readable reports:
  • Set your console width appropriately to avoid line wrapping.
  • Pipe your report using the appropriate truncation command if you care only about the first few report columns.
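One possible truncation command is the standard cut utility. The sketch below wraps it in a small helper; the name first_columns and the default width of 120 characters are assumptions for illustration, and the demonstration line is stand-in text, not real report output.

```shell
# first_columns [width]
# Keeps only the first <width> characters of every input line (default: 120).
first_columns() {
    cut -c "1-${1:-120}"
}

# Intended usage (not executed here):
#   advisor --report=survey --project-dir=./myAdvisorProj -- ./bin/myTargetApplication | first_columns 100

# Demonstration on stand-in text:
printf '%s\n' 'col1 col2 col3 col4 col5' | first_columns 9
```

Because the loop IDs are in the first column, a narrow width is usually enough when you only need to look up IDs.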
After performing the prerequisites, do one of the following:
  • For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report. Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
  • For Offload Modeling: Go to Project Properties > Performance Modeling and enter the CLI action option in the Other parameters field. For example, --select=5,10,12.
  • Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> when running a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
    advisor --collect=tripcounts --flop --project-dir=./myAdvisorProj --select=5,10,12 -- ./bin/myTargetApplication
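  • Mark the loop(s) of interest using the advisor CLI action --mark-up-loops and action option --select=<string>. For example:
    advisor --mark-up-loops --select=5,10,12 --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.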
There are different ways to select loops in the advisor CLI environment:
  • The advisor CLI action options --mark-up-list=<string> and --select=<string> merely simulate enabling a GUI checkbox when used with the --collect action. They are active only for the duration of the --collect command.
  • The same options used with the advisor CLI action --mark-up-loops actually enable a GUI checkbox. They remain active beyond the duration of the --mark-up-loops command and apply to all downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
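The two behaviors can be compared side by side. The sketch below only builds and prints the command lines; it does not run advisor. The variable names are hypothetical, and the loop IDs 5,10,12 are the sample values used earlier in this section.

```shell
PROJECT_DIR=./myAdvisorProj
APP=./bin/myTargetApplication
LOOP_IDS=5,10,12   # sample IDs; take real ones from your Survey report

# Transient selection: the loops are selected for this one collection only.
transient_cmd="advisor --collect=tripcounts --flop --project-dir=$PROJECT_DIR --select=$LOOP_IDS -- $APP"

# Persistent selection: mark up once, then later collections reuse the marks
# without repeating --select.
markup_cmd="advisor --mark-up-loops --select=$LOOP_IDS --project-dir=$PROJECT_DIR -- $APP"
followup_cmd="advisor --collect=dependencies --project-dir=$PROJECT_DIR -- $APP"

printf '%s\n' "$transient_cmd" "$markup_cmd" "$followup_cmd"
```

The transient form suits a one-off experiment; the persistent form is likely the better fit when several downstream analyses should all target the same loops.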

Select Loops by Source File/Line Number

Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
  • You want to perform a deeper analysis on only a few loops.
  • CLI environment: You are analyzing a target application for which you have access to source code and can identify source file/line numbers.
Prerequisites:
  1. Run a Survey analysis.
  2. advisor CLI environment: If necessary, identify the source file and line number for the loops of interest.
    advisor --report=survey --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
After performing the prerequisites, do one of the following:
  • For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey report.
    Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
  • For Offload Modeling: Go to Project Properties > Performance Modeling and enter the CLI action option in the Other parameters field. For example, --select=foo.cpp:34,bar.cpp:192.
  • Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> for a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
    advisor --collect=tripcounts --flop --project-dir=./myAdvisorProj --select=foo.cpp:34,bar.cpp:192 -- ./bin/myTargetApplication
  • Mark the loop(s) of interest using the advisor CLI action --mark-up-loops and action option --select=<string>. For example:
    advisor --mark-up-loops --select=foo.cpp:34,bar.cpp:192 --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
  • There is essentially no difference between selecting loops by ID and selecting loops by source file/line in the GUI environment. The difference is in the advisor CLI environment:
    • The advisor CLI action option --mark-up-list=<string> merely simulates enabling a GUI checkbox; therefore, it persists only for the duration of the --collect command.
    • The advisor CLI action --mark-up-loops with the action option --select=<string> actually enables a GUI checkbox; therefore, it persists beyond the duration of the --mark-up-loops command and applies to downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
  • If you use the --mark-up-loops CLI action to mark up loops, you can append or remove source file/line numbers for a subsequent analysis run using the advisor CLI action options --append=<string> and --remove=<string>, respectively.
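When the list of source locations comes from a script rather than from typing, the comma-separated value for --select can be assembled programmatically. The following is a minimal sketch; the helper name join_select_value is hypothetical, and the command is printed rather than executed.

```shell
# join_select_value <file:line>...
# Joins its arguments with commas, producing a value suitable for --select.
join_select_value() {
    # "$*" joins arguments with the first character of IFS; the subshell
    # keeps the IFS change from leaking into the caller.
    ( IFS=,; printf '%s\n' "$*" )
}

select_value=$(join_select_value foo.cpp:34 bar.cpp:192)

# Print the resulting mark-up command instead of running it.
echo "advisor --mark-up-loops --select=$select_value --project-dir=./myAdvisorProj -- ./bin/myTargetApplication"
```

The helper does not validate that each argument has the file:line form; add a check if the locations come from an untrusted source.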

Select Loops by Criteria

Goal: Minimize collection overhead.
Applicable analyses: Dependencies, Memory Access Patterns.
Use when you want to perform a deeper analysis on loops chosen by criteria instead of by human input, such as when you are running Intel® Advisor in batch mode or using automated scripts.
To implement in the advisor CLI environment, run commands similar to the following one by one from the command line, or put them in a script and run it to execute the commands automatically. Use the --select (recommended) or --loops option to select loops by criteria.
For example, to analyze loop-carried dependencies in loops/functions that have the Assumes dependency present issue, use one of the following:
  • Example 1:
    advisor --collect=survey --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    advisor --mark-up-loops --select="scalar,has-issue" --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    advisor --collect=dependencies --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
  • Example 2:
    advisor --collect=survey --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    advisor --collect=dependencies --select="scalar,has-issue" --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
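For batch use, the three commands of Example 1 can be wrapped in a script along the following lines. This is a sketch under assumptions: the DRY_RUN switch and the run_step helper are hypothetical conveniences, not part of the advisor CLI, and the script defaults to printing the commands so it can be reviewed before anything is executed.

```shell
#!/bin/sh
set -e   # stop at the first failing step so later analyses do not run on stale data

PROJECT_DIR=${PROJECT_DIR:-./myAdvisorProj}
APP=${APP:-./bin/myTargetApplication}
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 prints each command, 0 executes it

# run_step <command> [args...]
run_step() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

run_step advisor --collect=survey --project-dir="$PROJECT_DIR" -- "$APP"
run_step advisor --mark-up-loops --select="scalar,has-issue" --project-dir="$PROJECT_DIR" -- "$APP"
run_step advisor --collect=dependencies --project-dir="$PROJECT_DIR" -- "$APP"
```

Set DRY_RUN=0 in the environment to run the commands for real once the printed sequence looks right.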

Select Loops by Markup Algorithm

Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
This is only applicable to the Offload Modeling perspective.
Use --select=r:markup=<algorithm> when you want to perform a deeper analysis on loops chosen by a predefined markup algorithm based on the programming model used and/or the estimated offload profitability.
If you analyze an application that runs on a CPU, use the gpu_generic algorithm. This algorithm selects all potentially profitable loops/functions for additional analyses to collect more data and make sure they can be safely offloaded.
If you analyze code regions that are already offloaded and use a specific programming model, use one of the following algorithms:
  • omp - Select OpenMP* loops.
  • dpcpp - Select Data Parallel C++ loops.
  • ocl - Select OpenCL™ loops.
  • daal - Select Intel® oneAPI Data Analytics Library loops.
  • tbb - Select Intel® oneAPI Threading Building Blocks loops.
For example, to run the Offload Modeling perspective and analyze potentially profitable code regions in detail:
  • Example 1:
    advisor --collect=survey --project-dir=./myAdvisorProj --stackwalk-mode=online --static-instruction-mix -- ./bin/myTargetApplication
    advisor --collect=tripcounts --project-dir=./myAdvisorProj --flop --enable-cache-simulation --target-device=gen11_icl --stacks --data-transfer=light -- ./bin/myTargetApplication
    advisor --mark-up-loops --project-dir=./myAdvisorProj --select markup=gpu_generic -- ./bin/myTargetApplication
    advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    advisor --collect=projection --project-dir=./myAdvisorProj
  • Example 2:
    advisor --collect=survey --project-dir=./myAdvisorProj --stackwalk-mode=online --static-instruction-mix -- ./bin/myTargetApplication
    advisor --collect=tripcounts --project-dir=./myAdvisorProj --flop --enable-cache-simulation --target-device=gen11_icl --stacks --data-transfer=light -- ./bin/myTargetApplication
    advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --select markup=gpu_generic --project-dir=./myAdvisorProj -- ./bin/myTargetApplication
    advisor --collect=projection --project-dir=./myAdvisorProj
Currently, there is no GUI equivalent of the markup strategies. The gpu_generic strategy is used by default. If you want to change it, go to Project Properties > Performance Modeling and enter --select=<string> into the Other parameters field.
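In a script, it can help to reject a mistyped algorithm name before starting a long collection. The sketch below checks a name against the algorithms listed in this section and prints the selection arguments in the spelling used by the examples above (--select markup=<algorithm>). The helper name markup_select_args is hypothetical.

```shell
# markup_select_args <algorithm>
# Prints the selection arguments for a listed algorithm; returns 1 for any other name.
markup_select_args() {
    case "$1" in
        gpu_generic|omp|dpcpp|ocl|daal|tbb)
            printf '%s\n' "--select markup=$1"
            ;;
        *)
            echo "unknown markup algorithm: $1" >&2
            return 1
            ;;
    esac
}

markup_select_args gpu_generic
```

The list of accepted names is hard-coded from this page, so it needs updating if your Intel® Advisor version supports other algorithms.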

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.