User Guide


CPU / Memory Roofline Insights

Visualize actual performance against hardware-imposed performance ceilings by running the CPU / Memory Roofline Insights perspective. It helps you determine the main limiting factor (memory bandwidth or compute capacity) and provides an ideal roadmap of potential optimization steps.

CPU Roofline Report

Example of a CPU Roofline report
After you execute the CPU / Memory Roofline Insights perspective, you get a CPU Roofline report of collected results.
  • The Roofline chart plots an application's achieved performance and arithmetic intensity against the machine's maximum achievable performance:
    • Arithmetic intensity (x axis) - measured in floating-point operations (FLOPs) and/or integer operations (INTOPs) per byte transferred between CPU/VPU and memory, based on the loop/function algorithm
    • Performance (y axis) - measured in billions of floating-point operations per second (GFLOPS) and/or billions of integer operations per second (GINTOPS)
  • In the Code Analytics tab, see a focused Roofline chart for a selected loop/function with more details about its performance and limitations.
  • In a separate tab, see the Roofline Conclusions for a selected loop/function, with optimizations recommended based on the position of its dot on the chart.
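
To make the relationship between the two axes concrete, the following is a minimal sketch of the general roofline model, not of how the product computes its report. The function names and the machine figures (100 GFLOPS peak, 20 GB/s bandwidth) are hypothetical and chosen only for illustration. At a given arithmetic intensity, the performance ceiling is the lower of the compute roof and the memory roof (intensity multiplied by bandwidth).

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gb_s):
    """Roofline ceiling: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gb_s)


def limiting_factor(arithmetic_intensity, peak_gflops, bandwidth_gb_s):
    """Name the roof that caps performance at this arithmetic intensity."""
    # The ridge point is the intensity at which the two roofs meet.
    ridge = peak_gflops / bandwidth_gb_s
    if arithmetic_intensity < ridge:
        return "memory bandwidth"
    return "compute capacity"


# Hypothetical machine limits, for illustration only.
PEAK_GFLOPS = 100.0
BANDWIDTH_GB_S = 20.0

for ai in (0.5, 5.0, 16.0):
    ceiling = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GB_S)
    bound = limiting_factor(ai, PEAK_GFLOPS, BANDWIDTH_GB_S)
    print(f"AI={ai} FLOP/byte: ceiling={ceiling} GFLOPS, limited by {bound}")
```

A dot far below its ceiling suggests room for optimization, and the roof directly above the dot indicates which limiting factor to address first.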

How It Works

The CPU / Memory Roofline Insights perspective includes the following steps:
  • Collect loop/function timings using the Survey analysis.
  • Collect floating-point and/or integer operations data, memory traffic data, and measure the hardware limitations of your machine using the
    analysis in the
    This collection can take three to four times longer than the Survey analysis.
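
The two steps above are both needed because a dot on the chart combines a time measurement with operation and memory traffic counts. The following sketch shows that arithmetic under stated assumptions: the `LoopMeasurement` type, its field names, and the sample numbers are hypothetical and do not reflect the tool's actual data format.

```python
from dataclasses import dataclass


@dataclass
class LoopMeasurement:
    """Hypothetical per-loop data gathered across the two collection steps."""
    name: str
    seconds: float      # loop/function time from the timing collection
    flops: float        # floating-point operations executed by the loop
    bytes_moved: float  # bytes transferred between CPU and memory


def roofline_point(m):
    """Return (arithmetic intensity in FLOP/byte, performance in GFLOPS)."""
    arithmetic_intensity = m.flops / m.bytes_moved
    gflops = m.flops / m.seconds / 1e9
    return arithmetic_intensity, gflops


# Made-up numbers for illustration only.
loop = LoopMeasurement("hypothetical_loop", seconds=0.5, flops=4e9, bytes_moved=16e9)
ai, gflops = roofline_point(loop)
print(f"{loop.name}: x = {ai} FLOP/byte, y = {gflops} GFLOPS")
```

Without the timing data the y coordinate cannot be computed, and without the operation and traffic counts neither coordinate can.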

Perspective Views

  • Analysis Workflow pane - Review the controls available to configure the perspective workflow for your application.
  • CPU Roofline chart - Review the controls available to help you focus on the performance data most important to you.
  • Refinement reports - Review the controls available to help you investigate the dependencies and memory issues of your application.
