Survey, Trip Counts, FLOPS, and Roofline Analyses

Survey Report Purpose and Usage

Run a Survey analysis to generate a Survey Report that integrates compiler report data and performance data for your target application in one place. Optionally, run a Trip Counts analysis, a FLOP analysis, or both to add data to the Survey Report. The Roofline analysis automatically runs a Survey analysis followed by a FLOP analysis.
  • Survey analysis - Identifies:
    • Where vectorization, or parallelization with threads, will pay off the most
    • Whether vectorized loops are providing benefit and, if not, why not
    • Un-vectorized loops and why they are not vectorized
    • Performance problems in general
  • Trip Counts analysis - Dynamically identifies the number of times loops and functions are invoked and executed (also called call count/loop count and iteration count, respectively). Use Trip Counts data to:
    • Detect loops with too-small trip counts and trip counts that are not a multiple of vector length.
    • Analyze parallelism granularity more deeply.
  • FLOP analysis - Dynamically measures floating-point and integer operations, and memory traffic. Use the FLOP analysis to generate application memory usage and performance values that help you make better decisions about your vectorization strategy.
  • Roofline analysis - Helps you visualize actual performance against hardware-imposed performance ceilings and determine the main limiting factor (memory bandwidth or compute capacity), which gives you a roadmap of potential optimization steps. Use the Roofline chart to answer the following questions:
    • What is the maximum achievable performance with your current hardware resources?
    • Does your application work optimally on current hardware resources?
    • If not, what are the best candidates for optimization?
    • Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
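
The Roofline questions above come down to a small amount of arithmetic. The following is a minimal sketch of a simplified, single-roof Roofline model, not Intel Advisor's implementation. The function name `roofline_point`, its parameters, and the sample numbers are all hypothetical and chosen only for illustration. It assumes one compute ceiling and one memory-bandwidth ceiling, whereas a real Roofline chart typically shows several of each.

```python
def roofline_point(flop, bytes_moved, seconds, peak_gflops, peak_bandwidth_gbs):
    """Place one loop on a simplified Roofline model (illustrative only)."""
    # Arithmetic intensity: floating-point operations per byte of memory traffic.
    arithmetic_intensity = flop / bytes_moved
    # Performance the loop actually achieved, in GFLOPS.
    achieved_gflops = flop / seconds / 1e9
    # The attainable ceiling is the lower of the compute roof and the memory roof.
    ceiling_gflops = min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)
    # At the ridge point the two roofs meet; left of it the memory roof is lower.
    ridge_point = peak_gflops / peak_bandwidth_gbs
    if arithmetic_intensity < ridge_point:
        limiting_factor = "memory bandwidth"
    else:
        limiting_factor = "compute capacity"
    return {
        "arithmetic_intensity": arithmetic_intensity,
        "achieved_gflops": achieved_gflops,
        "ceiling_gflops": ceiling_gflops,
        "limiting_factor": limiting_factor,
        # How far below its ceiling the loop runs (1.0 means it sits on the roof).
        "headroom": ceiling_gflops / achieved_gflops,
    }


# Hypothetical loop: 2 GFLOP of work, 8 GB of traffic, 0.5 s of run time,
# on a machine assumed to peak at 100 GFLOPS and 50 GB/s.
point = roofline_point(
    flop=2e9,
    bytes_moved=8e9,
    seconds=0.5,
    peak_gflops=100.0,
    peak_bandwidth_gbs=50.0,
)
print(point)
```

In this made-up case the loop's arithmetic intensity lies left of the ridge point, so memory bandwidth is the limiting factor, and the gap between achieved performance and the ceiling indicates how much room for optimization remains.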

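The two trip-count checks listed under the Trip Counts analysis (too-small trip counts, and trip counts that are not a multiple of the vector length) can be illustrated with a short sketch. This is not how Intel Advisor computes or reports them. The function name `trip_count_report` and the `min_vector_iterations` threshold are hypothetical, and the vector length here means the number of elements processed per vector iteration.

```python
def trip_count_report(trip_count, vector_length, min_vector_iterations=4):
    """Flag trip counts that limit vectorization benefit (illustrative only)."""
    # Split the loop into full vector iterations and leftover scalar iterations.
    vector_iterations, remainder_iterations = divmod(trip_count, vector_length)
    return {
        "vector_iterations": vector_iterations,
        "remainder_iterations": remainder_iterations,
        # Hypothetical threshold: too few vector iterations to amortize loop overhead.
        "too_small": vector_iterations < min_vector_iterations,
        # A nonzero remainder means some iterations run in a remainder loop.
        "not_multiple_of_vector_length": remainder_iterations != 0,
    }


# A 10-iteration loop with 8 elements per vector: one vector iteration plus 2 leftovers.
short_loop = trip_count_report(trip_count=10, vector_length=8)
# A 1024-iteration loop divides evenly into 128 vector iterations.
long_loop = trip_count_report(trip_count=1024, vector_length=8)
print(short_loop)
print(long_loop)
```

A loop like the first one spends a noticeable share of its iterations outside the vector body, which is the kind of case that Trip Counts data helps surface.
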
Survey Report Regions

  • Filters pane -
    Filter analysis data by a variety of criteria, such as module, loop/function, and vectorized/non-vectorized.
  • Roofline Chart pane -
    Visualize actual performance against hardware-imposed performance ceilings and determine the main limiting factor (memory bandwidth or compute capacity).
  • Loop Information pane -
    View integrated compiler report data and Intel Advisor performance data for target application loops, and mark a loop for deeper analysis.
  • Advanced View pane -
    View more information for a loop selected in the Loop Information pane.
    • Source tab -
      View source code for a selected loop.
    • Top Down tab -
      View the function/loop hierarchy in a stack, and the source code associated with a specific function or loop. Each function or loop appears on a separate grid line. Loops are identified with an icon, the word [loop, followed by the source location and the function or procedure name that executes it.
    • Code Analytics tab -
      View the most important statistics for a selected loop.
    • Assembly tab -
      View assembly representation for a selected loop.
    • Recommendations tab -
      Explore code-specific recommendations for how to fix vectorization issues (Vectorization Advisor only).
    • Why No Vectorization? tab -
      View the reason automated vectorization failed (Vectorization Advisor only).
The associated Survey Source window, which you can use to view details about a code region, has the following panes:
  • View Activation pane -
    Enable or disable views shown in the Source view.
  • Source View pane -
    View user-visible source code representation of the selected site.
  • Assembly View pane -
    View assembly representation of the selected site.
  • Call Stack View pane -
    View the call stack for the selected code region. Click to display related code regions in the File: filename pane, or click to display the context menu.
Do one of the following to access the Survey Source window:
