User Guide

  • 2020
  • 06/18/2020

Vectorization Workflow Diagram

Follow these steps (white blocks are optional) to get started using the Vectorization Advisor in the Intel Advisor.
Vectorization Advisor workflow: Dig Deeper
  • Survey analysis - Offers integrated compiler report data and performance data all in one place. Use it to help identify:
    • Where vectorization, or parallelization with threads, will pay off the most
    • Whether vectorized loops are providing benefit and, if not, why not
    • Un-vectorized loops and why they are not vectorized
    • Performance problems in general
    The Survey Analysis also provides code-specific recommendations for how to fix vectorization issues, and quick visibility into source code and assembly code.
  • Trip Counts and FLOP analysis (optional) - Dynamically identifies the number of times loops are invoked and the number of times they execute (sometimes called call count/loop count and iteration count, respectively), and measures the number of floating-point and integer operations and the memory traffic. Use it to make better decisions about your vectorization strategy for particular loops, and to optimize already-parallel loops.
  • Roofline analysis (optional) - Helps visualize actual performance against hardware-imposed performance ceilings, as well as determine the main limiting factor (memory bandwidth or compute capacity), thereby providing an ideal roadmap of potential optimization steps.
    Use the Roofline chart to answer the following questions:
    • What is the maximum achievable performance with your current hardware resources?
    • Does your application work optimally on current hardware resources?
    • If not, what are the best candidates for optimization?
    • Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
  • Dependencies analysis (optional) - For safety purposes, the compiler is often conservative when assuming data dependencies. Use a Dependencies-focused Refinement Report to check for real data dependencies in loops the compiler did not vectorize because of assumed dependencies. If real dependencies are detected, the analysis can provide additional details to help resolve the dependencies. Your objective: Identify and better characterize real data dependencies that could make forced vectorization unsafe.
  • Memory Access Patterns (MAP) analysis (optional) - Use a MAP-focused Refinement Report to check for various memory issues, such as non-contiguous memory accesses and unit stride vs. non-unit stride accesses. Your objective: Eliminate issues that could lead to significant vector code execution slowdown or block automatic vectorization by the compiler.
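
The Roofline questions above come down to one bound: attainable performance is the lesser of the compute ceiling and arithmetic intensity multiplied by peak memory bandwidth. The following is a minimal sketch of that arithmetic in Python. The machine peaks, the example loop, and the function names are hypothetical and chosen only for illustration; they are not values or APIs from Intel Advisor.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline bound: the lower of the compute ceiling and the memory ceiling.

    arithmetic_intensity is in FLOP per byte of memory traffic.
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def limiting_factor(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Name the ceiling that bounds a loop with the given intensity."""
    # The ridge point is the intensity at which the two ceilings meet.
    ridge_point = peak_gflops / peak_bandwidth_gbs
    if arithmetic_intensity < ridge_point:
        return "memory bandwidth"
    return "compute capacity"


# Hypothetical machine peaks (assumptions, not measured values).
PEAK_GFLOPS = 100.0
PEAK_BANDWIDTH_GBS = 50.0

# Hypothetical loop a[i] = b[i] + s * c[i] on 8-byte doubles:
# 2 FLOP per iteration, 3 * 8 = 24 bytes of memory traffic per iteration.
triad_intensity = 2 / 24
triad_bound = attainable_gflops(triad_intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
triad_limit = limiting_factor(triad_intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)

print(f"bound: {triad_bound:.2f} GFLOPS, limited by {triad_limit}")
```

A loop whose measured performance sits well below its bound is a candidate for optimization; a loop already close to the memory ceiling is unlikely to gain much from vectorization alone.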
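
To make the difference between a real and an assumed data dependency in the Dependencies step concrete, the sketch below runs a loop body in forward and in reverse iteration order and compares the results. This is a conceptual illustration only, not how the Dependencies analysis works; all names are hypothetical. A differing result shows that iteration order matters, so a real loop-carried dependency exists. An identical result for one input does not prove that the loop is free of dependencies.

```python
def run_loop(body, data, order):
    """Apply body(a, i) for each index in order, on a copy of data."""
    a = list(data)
    for i in order:
        body(a, i)
    return a


def order_sensitive(body, data, start=1):
    """Return True if reversing the iteration order changes the result."""
    n = len(data)
    forward = run_loop(body, data, range(start, n))
    backward = run_loop(body, data, reversed(range(start, n)))
    return forward != backward


def prefix_sum(a, i):
    # Reads a[i - 1], which the previous iteration wrote: a real dependency.
    a[i] = a[i - 1] + a[i]


def scale(a, i):
    # Touches only a[i]: iterations are independent of each other.
    a[i] = 2 * a[i]


values = [1, 2, 3, 4]
print("prefix_sum order-sensitive:", order_sensitive(prefix_sum, values))
print("scale order-sensitive:", order_sensitive(scale, values))
```

A loop like the hypothetical `scale` is the kind of case where a conservative compiler assumption can be overridden safely; a loop like `prefix_sum` is one where forcing vectorization would change the result.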
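
The unit stride vs. non-unit stride distinction in the MAP step can be illustrated by looking at the sequence of element offsets a loop touches. The sketch below is a simplified model with hypothetical names and labels, not the MAP analysis itself or its report categories. It assumes a row-major matrix, where element (i, j) lives at offset i * COLS + j.

```python
def classify_stride(indices):
    """Classify a sequence of accessed element offsets by their stride."""
    if len(indices) < 2:
        return "not enough accesses"
    # Collect the distinct differences between consecutive accesses.
    strides = {later - earlier for earlier, later in zip(indices, indices[1:])}
    if strides == {1}:
        return "unit stride"
    if len(strides) == 1:
        return f"constant stride {strides.pop()}"
    return "irregular"


ROWS, COLS = 4, 8  # hypothetical row-major matrix shape

# Inner loop over columns of row 0: consecutive elements in memory.
row_walk = [0 * COLS + j for j in range(COLS)]
# Inner loop over rows of column 0: jumps COLS elements each iteration.
col_walk = [i * COLS + 0 for i in range(ROWS)]
# Indirect access through an index array: no fixed pattern.
gather = [5, 0, 17, 3]

for name, walk in [("row_walk", row_walk), ("col_walk", col_walk), ("gather", gather)]:
    print(name, "->", classify_stride(walk))
```

Contiguous, unit-stride accesses map directly onto vector loads and stores, while the other two patterns typically need slower strided or gather instructions, which is why interchanging loops or changing the data layout is a common fix.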

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804