Loop Markup to Minimize Analysis Overhead

Issue

Running your target application on all detected loops with the Intel Advisor can take substantially longer than running your target application without the Intel Advisor. For example:

Runtime overhead by analysis (target application runtime with Intel Advisor compared to runtime without Intel Advisor):

  • Survey: 1.1x longer

  • Trip Counts & FLOP: 3 - 8x longer

  • Roofline: 3.1 - 8.1x longer

  • Dependencies: 5 - 100x longer

  • Memory Access Patterns (MAP): 5 - 20x longer

Solutions

Use the following techniques to skip uninteresting loops and analyze only the loops of interest.

Minimization technique: Select loops by ID

  • Impacted Intel Advisor analyses: Roofline, Trip Counts & FLOP, Dependencies, Memory Access Patterns

  • GUI control: checkbox(es) on the Survey Report

  • CLI action option: -mark-up-list=<string>

Minimization technique: Select loops by source file/line number

  • Impacted Intel Advisor analyses: Roofline, Trip Counts & FLOP, Dependencies, Memory Access Patterns

  • GUI control: checkbox(es) on the Survey Report

  • CLI action: -mark-up-loops with action option -select=<string>

Minimization technique: Select loops by criteria

  • Impacted Intel Advisor analyses: Dependencies, Memory Access Patterns

  • GUI control: Workflow pane > Batch mode and settings

  • CLI actions: -mark-up-loops or -collect, with action option -loops=<string>

Select Loops by ID

Minimize collection overhead.

Applicable analyses: Roofline, Trip Counts and FLOP, Dependencies, Memory Access Patterns.

Use when...

  • You want to perform a deeper analysis on only a few loops.

  • CLI environment: You cannot identify source file/line numbers, such as when you are analyzing a target application for which you do not have access to source code.

Prerequisites:

  1. Run a Survey analysis.

  2. CLI environment: Identify the loop IDs for the loops of interest.

    advixe-cl -report survey -project-dir ./myAdvisorProj -- ./bin/myTargetApplication

Tip

Intel Advisor reports tend to be very wide. Do one of the following to generate readable reports:

  • Set your console width appropriately to avoid line wrapping.

  • Pipe your report using the appropriate truncation command if you care only about the first few report columns.
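As a minimal sketch of the second option, the standard cut utility can keep only the first N characters of each report line. The helper name first_columns and the default width of 120 are illustrative assumptions, not part of Intel Advisor.

```shell
# Keep only the first N characters of every line read from stdin.
# first_columns is a hypothetical helper name; 120 is an arbitrary default width.
first_columns() {
    cut -c "1-${1:-120}"
}

# Stand-in input so the sketch runs anywhere. In practice, pipe the report
# command from the prerequisites into the helper, for example:
#   advixe-cl -report survey -project-dir ./myAdvisorProj -- ./bin/myTargetApplication | first_columns 150
printf '%s\n' "0123456789abcdef" | first_columns 10
```

Truncating by character count is the simplest choice; it assumes the columns you care about sit at the left edge of the report.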

After performing the prerequisites, do one of the following:

  • Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report.

    Then run a Trip Counts and FLOP, Dependencies, or Memory Access Patterns analysis.

  • Mark the loop(s) of interest using the CLI action option -mark-up-list=<string> when running a Trip Counts and FLOP, Dependencies, or Memory Access Patterns analysis. For example:

    advixe-cl -collect tripcounts -flop -project-dir ./myAdvisorProj -mark-up-list=5,10,12 -- ./bin/myTargetApplication
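When the loop IDs come from a script rather than being typed by hand, the comma-separated value for -mark-up-list can be assembled first. The following is a sketch only: join_ids and print_tripcounts_cmd are hypothetical helper names, and the command is printed instead of executed so that it can be inspected before use.

```shell
# Join loop IDs (one per argument) into a comma-separated string,
# matching the 5,10,12 form used in the example above.
# join_ids and print_tripcounts_cmd are hypothetical helper names.
join_ids() {
    (IFS=,; printf '%s\n' "$*")
}

# Print (not run) a Trip Counts and FLOP command limited to the given loop IDs.
# Usage: print_tripcounts_cmd <project-dir> <target> <id> [<id>...]
print_tripcounts_cmd() {
    project_dir=$1
    target=$2
    shift 2
    printf 'advixe-cl -collect tripcounts -flop -project-dir %s -mark-up-list=%s -- %s\n' \
        "$project_dir" "$(join_ids "$@")" "$target"
}

print_tripcounts_cmd ./myAdvisorProj ./bin/myTargetApplication 5 10 12
```

Because -mark-up-list applies only to the command it is passed to, the same list must be supplied again for each subsequent -collect run.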

Note

There is essentially no difference between selecting loops by ID and selecting loops by source file/line in the GUI environment. The difference is in the CLI environment:

  • The CLI action option -mark-up-list=<string> merely simulates enabling a GUI checkbox; therefore it persists only for the duration of the -collect command.

  • The CLI action -mark-up-loops with action option -select=<string> actually enables a GUI checkbox; therefore it persists beyond the duration of the -mark-up-loops command and applies to downstream analyses, such as Roofline, Trip Counts and FLOP, Dependencies, and Memory Access Patterns.

Select Loops by Source File/Line Number

Minimize collection overhead.

Applicable analyses: Roofline, Trip Counts and FLOP, Dependencies, Memory Access Patterns.

Use when...

  • You want to perform a deeper analysis on only a few loops.

  • CLI environment: You are analyzing a target application for which you have access to source code and can identify source file/line numbers.

Prerequisites:

  1. Run a Survey analysis.

  2. CLI environment: If necessary, identify the source file and line number for the loops of interest.

    advixe-cl -report survey -project-dir ./myAdvisorProj -- ./bin/myTargetApplication

After performing the prerequisites, do one of the following:

  • Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report.

    Then run a Trip Counts and FLOP, Dependencies, or Memory Access Patterns analysis.

  • Mark the loop(s) of interest using the CLI action -mark-up-loops and action option -select=<string>. For example:

    advixe-cl -mark-up-loops -select=foo.cpp:34,bar.cpp:192 -project-dir ./myAdvisorProj -- ./bin/myTargetApplication

    Then run a Trip Counts and FLOP, Dependencies, or Memory Access Patterns analysis.

Note

  • You can also append and remove source file/line numbers using the CLI action option -append=<string> and -remove=<string> respectively.

  • There is essentially no difference between selecting loops by ID and selecting loops by source file/line in the GUI environment. The difference is in the CLI environment:

    • The CLI action option -mark-up-list=<string> merely simulates enabling a GUI checkbox; therefore it persists only for the duration of the -collect command.

    • The CLI action -mark-up-loops with action option -select=<string> actually enables a GUI checkbox; therefore it persists beyond the duration of the -mark-up-loops command and applies to downstream analyses, such as Roofline, Trip Counts and FLOP, Dependencies, and Memory Access Patterns.
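Because a -mark-up-loops selection persists in the project, it can be adjusted step by step before a later analysis picks it up. The sketch below assumes that -append and -remove accept the same file:line form as -select; baz.cpp:57 is a made-up location, and run and markup are hypothetical helper names. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

PROJECT=./myAdvisorProj
TARGET=./bin/myTargetApplication

# markup <select|append|remove> <file:line[,file:line...]>
# Assumption: -append and -remove take the same value format as -select.
markup() {
    run advixe-cl -mark-up-loops "-$1=$2" -project-dir "$PROJECT" -- "$TARGET"
}

markup select foo.cpp:34,bar.cpp:192   # start from two loops
markup append baz.cpp:57               # add a third (hypothetical location)
markup remove foo.cpp:34               # drop one that is no longer of interest

# The stored selection then applies to this downstream analysis.
run advixe-cl -collect dependencies -project-dir "$PROJECT" -- "$TARGET"
```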

Select Loops by Criteria

Minimize collection overhead.

Applicable analyses: Dependencies, Memory Access Patterns.

Use when you want to perform a deeper analysis on loops chosen by criteria instead of by human input, such as when you are running the Intel Advisor in batch mode or using automated scripts.

To implement in the GUI environment:

  1. Toggle on the Batch mode control at the top of the Workflow pane.

  2. Enable the Dependencies and/or Memory Access Patterns checkboxes.

  3. Choose Automatic Selection and enable the appropriate criteria checkbox(es):

    • Dependencies analysis:

      • Scalar serial loops only (CLI corollary = scalar)

      • Innermost loops only (CLI corollary = loop-height=N; set to 0 for innermost loops only)

      • Above .1% of total CPU time only (CLI corollary = total-time>N, but you can specify percentage)

      • With "Assumed Dependency Present" issue only (CLI corollary = has-issue)

      • Exclude loops without source location (CLI corollary = has-source)

      • Top 10 loops with the biggest Self Time

    • Memory Access Patterns analysis:

      • With "Possible Inefficient Memory Access Pattern" issue only (CLI corollary = has-issue)

      • Above .1% of total CPU time only (CLI corollary = total-time>N, but you can specify percentage)

      • Exclude loops without source location (CLI corollary = has-source)

      • Loop height (CLI corollary = loop-height=N; where innermost loops have loop-height=0)

      • Top 10 loops with the biggest Self Time

  4. Click the Collect control.

To implement in the CLI environment, create a script similar to the following examples, which produce the same outcome:

  • Example 1:

    advixe-cl -collect survey -project-dir ./myAdvisorProj -- ./bin/myTargetApplication
    advixe-cl -mark-up-loops -loops="scalar,has-issue" -project-dir ./myAdvisorProj -- ./bin/myTargetApplication
    advixe-cl -collect dependencies -project-dir ./myAdvisorProj -- ./bin/myTargetApplication
    
  • Example 2:

    advixe-cl -collect survey -project-dir ./myAdvisorProj -- ./bin/myTargetApplication
    advixe-cl -collect dependencies -loops="scalar,has-issue" -project-dir ./myAdvisorProj -- ./bin/myTargetApplication
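In an automated script, the -loops value can be assembled from individual criteria. The sketch below assumes that the criteria listed earlier can be combined with commas in the same way as "scalar,has-issue"; criteria_list is a hypothetical helper name, and the threshold in total-time>1 is a placeholder whose unit this section does not specify. The command is printed rather than executed.

```shell
# Join selection criteria (one per argument) into a comma-separated -loops value.
# criteria_list is a hypothetical helper name.
criteria_list() {
    (IFS=,; printf '%s\n' "$*")
}

# Quote any criterion that contains '>' so that the shell does not
# interpret it as an output redirection.
LOOPS="$(criteria_list scalar has-source 'total-time>1')"

# Printed rather than executed, so the sketch is safe to run anywhere.
printf 'advixe-cl -collect dependencies -loops="%s" -project-dir ./myAdvisorProj -- ./bin/myTargetApplication\n' "$LOOPS"
```

Keeping the whole -loops value inside double quotes, as both examples above do, also guards against the redirection problem when the command is typed by hand.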
    