Execution Speed/Duration/Scope Properties to Minimize Analysis Overhead

Issue

Running your target application with the Intel Advisor can take substantially longer than running your target application without the Intel Advisor. For example:

Runtime overhead by analysis (target application runtime with Intel Advisor compared to runtime without Intel Advisor):
  • Survey: 1.1x longer
  • Trip Counts & FLOP: 3 - 8x longer
  • Roofline: 3.1 - 8.1x longer
  • Dependencies: 5 - 100x longer
  • MAP (Memory Access Patterns): 5 - 20x longer
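
To turn these ratios into a rough time budget, multiply your baseline runtime by the low and high multipliers. The sketch below does this with awk for a hypothetical 60-second baseline. The function name estimate_runtime and the BASELINE value are illustrative assumptions, not part of Intel Advisor.

```shell
#!/bin/sh
# Rough runtime estimate under analysis, from the overhead ratios listed above.
# estimate_runtime and BASELINE are illustrative names, not Intel Advisor features.

# $1 = baseline runtime in seconds, $2 = low multiplier, $3 = high multiplier
estimate_runtime() {
    awk -v base="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        if (lo == hi) { printf "%.0f s\n", base * lo }
        else { printf "%.0f - %.0f s\n", base * lo, base * hi }
    }'
}

BASELINE=60   # hypothetical runtime without Intel Advisor, in seconds

echo "Survey:             $(estimate_runtime "$BASELINE" 1.1 1.1)"
echo "Trip Counts & FLOP: $(estimate_runtime "$BASELINE" 3 8)"
echo "Roofline:           $(estimate_runtime "$BASELINE" 3.1 8.1)"
echo "Dependencies:       $(estimate_runtime "$BASELINE" 5 100)"
echo "MAP:                $(estimate_runtime "$BASELINE" 5 20)"
```

With these ratios, a 60-second application could spend anywhere from 5 to 100 minutes under Dependencies analysis, which is why the techniques below are worth applying first.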

Solutions

Use the following techniques to minimize overhead while collecting Intel Advisor analysis data. The Disable Additional Analysis technique also minimizes finalization overhead.

Change Stackwalk Mode from Offline (After collection) to Online (During Collection)

Minimize collection overhead.
Applicable analysis: Survey.
Set to online (during collection) when:
  • Survey analysis runtime overhead exceeds 1.1x.
  • A large quantity of data is allocated on the stack, which is common for Fortran applications and for applications with a large number of small, parallel OpenMP* regions.
To implement, do one of the following before/while running a Survey analysis:
  • Set Project Properties > Analysis Target > Survey Hotspots Analysis > Advanced > Stack unwinding mode to During collection.
  • Use the advixe-cl CLI action option --stackwalk-mode=online. For example:
    advixe-cl --collect=survey --project-dir=./myAdvisorProj --stackwalk-mode=online -- ./bin/myTargetApplication
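
The 1.1x threshold can be checked from two wall-clock timings of the same run, one without Intel Advisor and one under Survey analysis (measured however you prefer, for example with the shell's time keyword). The following is a minimal sketch; the helper name overhead_exceeds and the sample timings are hypothetical.

```shell
#!/bin/sh
# Compare two wall-clock timings against the 1.1x Survey overhead threshold.
# overhead_exceeds and the sample timings are illustrative, not Intel Advisor features.

# $1 = runtime without Intel Advisor (s), $2 = runtime under Survey analysis (s),
# $3 = threshold ratio (defaults to 1.1).
# Returns success (0) when the measured ratio exceeds the threshold.
overhead_exceeds() {
    awk -v base="$1" -v prof="$2" -v limit="${3:-1.1}" \
        'BEGIN { exit !(prof / base > limit) }'
}

# Hypothetical timings: 40 s without Intel Advisor, 52 s under Survey analysis.
if overhead_exceeds 40 52; then
    echo "Overhead above 1.1x: consider --stackwalk-mode=online"
else
    echo "Overhead within 1.1x"
fi
```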

Disable Stacks Collection

Minimize collection overhead.
Applicable analyses: Roofline, Trip Counts and FLOP.
To implement, do one of the following before/while running the analysis:
  • Disable the Vectorization Workflow pane > Enable Roofline with Callstacks checkbox.
  • Disable the Project Properties > Analysis Target > Trip Counts and FLOP Analysis > Advanced > Collect stacks checkbox.
  • Ensure the CLI action option --stacks is omitted from the command line. Alternative: Use the CLI action option --no-stacks.
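
As an illustration of the CLI alternative, the sketch below prints a Trip Counts and FLOP command line with stack collection explicitly switched off. It assumes tripcounts is the --collect analysis type name and reuses the placeholder project directory and target binary from this guide's other examples; the helper name tripcounts_no_stacks is hypothetical.

```shell
#!/bin/sh
# Print a Trip Counts and FLOP command line with stack collection switched off.
# Assumptions: "tripcounts" as the --collect analysis type, plus the placeholder
# project directory and target binary used in this guide's other examples.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

tripcounts_no_stacks() {
    echo "advixe-cl --collect=tripcounts --project-dir=$PROJECT_DIR --no-stacks -- $TARGET"
}

# Review the printed command, then run it yourself (or pipe it to sh).
tripcounts_no_stacks
```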

Disable Stitch Stacks

Minimize collection overhead.
Applicable analysis: Survey.
The stitch stacks option restores a logical call tree for Intel® Threading Building Blocks (Intel® TBB) or OpenMP* applications by catching notifications from the runtime and attaching stacks to a point introducing a parallel workload.
Disable when Survey analysis runtime overhead exceeds 1.1x.
To implement, do one of the following before/while running the analysis:
  • Disable the Project Properties > Analysis Target > Survey Hotspots Analysis > Advanced > Stitch stacks checkbox.
  • Use the advixe-cl CLI action option --no-stack-stitching. For example:
    advixe-cl --collect=survey --project-dir=./myAdvisorProj --no-stack-stitching -- ./bin/myTargetApplication
Disabling stack stitching may decrease the overhead for applications using Intel TBB.

Increase Sampling Interval

Minimize collection overhead.
Applicable analysis: Survey.
Increase the wait time between analysis collection samples when your target application runtime is long.
To implement, do one of the following before/while running the analysis:
  • Increase the value in the Project Properties > Analysis Target > Survey Hotspots Analysis > Advanced > Sampling interval field.
  • Use the advixe-cl CLI action option --interval=<integer> when running a Survey analysis. For example:
    advixe-cl --collect=survey --project-dir=./myAdvisorProj --interval=20 -- ./bin/myTargetApplication

Limit Collected Analysis Data

Minimize collection overhead.
Applicable analysis: Survey.
Decrease the amount of collected raw data when exceeding a size threshold could cause issues, for example, when storage space is limited.
To implement, do one of the following before/while running the analysis:
  • Decrease the value in the Project Properties > Analysis Target > Survey Hotspots Analysis > Advanced > Collection data limit, MB field.
  • Decrease the value of the advixe-cl CLI action option --data-limit=<integer>. For example:
    advixe-cl --collect=survey --project-dir=./myAdvisorProj --data-limit=250 -- ./bin/myTargetApplication
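
The Survey collection options described so far can plausibly be combined in one run. The sketch below prints such a command line. It assumes the options are accepted together, which this guide does not state explicitly, and it reuses the sample interval and data limit values from the examples above rather than tuned recommendations. The helper name survey_low_overhead is hypothetical.

```shell
#!/bin/sh
# Combine the Survey collection-overhead options described above in one command.
# survey_low_overhead is an illustrative helper; it assumes the options can be
# used together in a single Survey run.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

# $1 = sampling interval, $2 = collection data limit in MB
survey_low_overhead() {
    echo "advixe-cl --collect=survey --project-dir=$PROJECT_DIR" \
         "--stackwalk-mode=online --no-stack-stitching" \
         "--interval=$1 --data-limit=$2 -- $TARGET"
}

# Sample values taken from the examples in this guide.
survey_low_overhead 20 250
```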

Limit Loop Call Count

Minimize collection overhead.
Applicable analyses: Dependencies, Memory Access Patterns.
Decrease the maximum number of instances of each marked loop that are analyzed.
To implement, do one of the following before/while running the analysis:
  • Supply a non-zero value in the Project Properties > Analysis Target > [Name] Analysis > Advanced > Loop Call Count Limit field.
  • Supply a non-zero value in the advixe-cl CLI action option --loop-call-count-limit=<integer>. For example:
    advixe-cl --collect=dependencies --project-dir=./myAdvisorProj --loop-call-count-limit=10 -- ./bin/myTargetApplication
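
The same option should apply to a Memory Access Patterns run. The sketch below prints the command for either analysis and rejects a zero limit, since the technique calls for a non-zero value. It assumes map is the --collect type for Memory Access Patterns and that the loops to analyze have already been marked; the helper name loop_limit_cmd is hypothetical.

```shell
#!/bin/sh
# Print a Dependencies or Memory Access Patterns command with a loop call count limit.
# Assumptions: "map" as the --collect type for Memory Access Patterns, and that the
# loops to analyze have already been marked. loop_limit_cmd is an illustrative helper.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

# $1 = analysis type (dependencies or map), $2 = non-zero loop call count limit
loop_limit_cmd() {
    # Reject empty or non-numeric limits.
    case "$2" in
        ''|*[!0-9]*) echo "limit must be a positive integer" >&2; return 1 ;;
    esac
    # Reject zero, because the technique requires a non-zero value.
    if [ "$2" -eq 0 ]; then
        echo "limit must be non-zero" >&2
        return 1
    fi
    echo "advixe-cl --collect=$1 --project-dir=$PROJECT_DIR --loop-call-count-limit=$2 -- $TARGET"
}

loop_limit_cmd map 10
```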

Disable Additional Analysis

Minimize finalization overhead.
Applicable analysis: Survey.
Implement these techniques when the additional data is not important to you.
The default setting for all the properties/options listed below is disabled. The checkboxes are located under Project Properties > Analysis Target > Survey Hotspots Analysis > Advanced.
  • Disable the Analyze MKL loops and functions checkbox.
    CLI action option: --no-mkl-user-mode
    Description: Do not show Intel® Math Kernel Library (Intel® MKL) loops and functions in Intel Advisor reports.
  • Disable the Analyze Python loops and functions checkbox.
    CLI action option: --no-profile-python
    Description: Do not show Python* loops and functions in Intel Advisor reports.
  • Disable the Analyze loops that reside in non-executed code paths checkbox.
    CLI action option: --no-support-multi-isa-binaries
    Description: Do not collect a variety of data for loops that reside in non-executed code paths, including loop assembly code, instruction set architecture (ISA), and vector length. This capability is available only for binaries compiled using the -ax (Linux* OS) or /Qax (Windows* OS) option with an Intel® compiler.
  • Disable the Enable register spill/fill analysis checkbox.
    CLI action option: --no-spill-analysis
    Description: Do not calculate the number of consecutive load/store operations in registers and related memory traffic.
  • Disable the Enable static instruction mix analysis checkbox.
    CLI action option: --no-static-instruction-mix
    Description: Do not statically calculate the number of specific instructions present in the binary.
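
For reference, the sketch below prints a Survey command line that passes every option listed above. Because these analyses are disabled by default, the flags mainly make the intent explicit. The helper name survey_minimal_finalization is hypothetical, and the project directory and target binary are the placeholders used elsewhere in this guide.

```shell
#!/bin/sh
# Print a Survey command that switches off every additional analysis listed above.
# survey_minimal_finalization is an illustrative helper, not an Intel Advisor feature.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

survey_minimal_finalization() {
    # Options taken from the list above, split over two lines for readability.
    opts="--no-mkl-user-mode --no-profile-python --no-support-multi-isa-binaries"
    opts="$opts --no-spill-analysis --no-static-instruction-mix"
    echo "advixe-cl --collect=survey --project-dir=$PROJECT_DIR $opts -- $TARGET"
}

survey_minimal_finalization
```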

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804