User Guide


GPU Roofline Insights

Measure the actual performance of GPU kernels using benchmarks and hardware metric profiling, visualize it against hardware-imposed performance ceilings, and determine the main limiting factor.

GPU Roofline Summary

Example of a GPU Roofline Summary report
After you execute the GPU Roofline Insights perspective, you first get a report of collected results. It includes the most important information about your application: performance metrics for kernels executed on the GPU and for loops/functions executed on the CPU, with hints for next steps:
  • View the main performance metrics of your program in the Program Metrics pane. This pane tells you how well your application uses the GPU resources and how much room for improvement it has.
  • View the Roofline preview charts in OP/S Bandwidth.
  • View the execution time details for the GPU- and CPU-executed parts of your code in Performance Characteristics. It can tell you how well your application uses the GPU resources.
  • View the top five time-consuming loops on the GPU and on the CPU, sorted by self time and shown with performance metrics, in the Top Hotspots section. Start with the loops listed in this section when checking for performance issues.
After you review the summary, continue to the GPU Roofline Insights tab to examine your application performance in more detail.
For more information about metrics reported, see Accelerator Metrics.
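
The Top Hotspots ranking described above (top five entries per device, sorted by self time) can be sketched on any list of timing records. This is a minimal illustration only: the `top_hotspots` helper, the record fields `name`, `device`, and `self_time`, and all values are hypothetical and do not reflect the tool's export format.

```python
def top_hotspots(records, device, limit=5):
    """Return up to `limit` records for one device, longest self time first."""
    on_device = [r for r in records if r["device"] == device]
    return sorted(on_device, key=lambda r: r["self_time"], reverse=True)[:limit]


# Hypothetical measurements; self time is in seconds.
records = [
    {"name": "kernel_a", "device": "GPU", "self_time": 0.42},
    {"name": "kernel_b", "device": "GPU", "self_time": 1.30},
    {"name": "loop_main", "device": "CPU", "self_time": 0.75},
    {"name": "kernel_c", "device": "GPU", "self_time": 0.08},
    {"name": "loop_init", "device": "CPU", "self_time": 0.02},
]

gpu_top = top_hotspots(records, "GPU")
cpu_top = top_hotspots(records, "CPU")

for r in gpu_top:
    print(f'{r["name"]}: {r["self_time"]:.2f} s')
```

Sorting by self time rather than total time puts the loops that do the work themselves at the top, which is why they are the suggested starting point.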

How It Works

The GPU Roofline Insights perspective includes the following steps:
  • Collects OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
  • Measures the hardware limitations and collects floating-point and integer operations data with GPU profiling.
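
Together, the two steps supply what a roofline calculation needs: kernel timings and memory traffic from the first, operation counts and hardware ceilings from the second. The sketch below illustrates the standard single-roof roofline model, where attainable performance is the smaller of peak compute and arithmetic intensity times peak bandwidth. It is not the tool's implementation; the `roofline_point` function and every number in it are assumptions chosen for illustration.

```python
def roofline_point(ops, bytes_moved, seconds, peak_ops_per_s, peak_bytes_per_s):
    """Place one kernel on a single-roof roofline (illustrative only)."""
    intensity = ops / bytes_moved               # arithmetic intensity, OP/byte
    achieved = ops / seconds                    # measured performance, OP/s
    memory_roof = intensity * peak_bytes_per_s  # bandwidth-imposed ceiling, OP/s
    ceiling = min(peak_ops_per_s, memory_roof)  # attainable performance, OP/s
    if memory_roof < peak_ops_per_s:
        limiting_factor = "memory bandwidth"
    else:
        limiting_factor = "compute"
    return {
        "intensity": intensity,
        "achieved": achieved,
        "ceiling": ceiling,
        "limiting_factor": limiting_factor,
        "headroom": ceiling / achieved,         # 1.0 means the kernel sits on the roof
    }


# Hypothetical kernel: 2e9 operations, 4e9 bytes moved, 0.125 s runtime,
# on hardware with made-up peaks of 1e12 OP/s and 1e11 bytes/s.
point = roofline_point(2e9, 4e9, 0.125, 1e12, 1e11)
print(point["limiting_factor"], point["headroom"])
```

With these made-up inputs the kernel falls under the bandwidth roof, so reducing memory traffic would raise its ceiling more than adding compute throughput.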

Perspective Views

  • Analysis Workflow pane - Review the controls available to configure the perspective workflow for your application.
  • GPU Roofline Summary report - Review a results summary that includes the most important information about your application performance.
  • GPU Roofline Regions report - Review the chart controls available to help you focus on the performance data most important to you.
  • Logs - Review the log messages reported during the perspective execution.
