GPU Roofline Insights Perspective
Measure and visualize the actual performance of GPU kernels against hardware-imposed performance ceilings using benchmarks and hardware metric profiling, and determine the main limiting factor.
GPU Roofline Summary

After you run the GPU Roofline Insights perspective, you first get a Summary report of the collected results. The report includes the most important information about your application: it shows performance metrics for kernels executed on GPU and loops/functions executed on CPU, and gives you hints for next steps:
- View the main performance metrics of your program in the Program Metrics pane. This pane tells you how well your application uses the GPU resources and how much room for improvement it has.
- View the Roofline preview charts in the OP/S Bandwidth pane.
- View the execution time details for the GPU- and CPU-executed parts of your code under Performance Characteristics. These details show how well your application uses the GPU resources.
- View the top five time-consuming loops on GPU and on CPU, sorted by self time and shown with their performance metrics, in the Top Hotspots section. Start with the loops listed in this section when checking for performance issues.
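As a rough illustration of how a Top Hotspots style ranking works, the sketch below sorts regions by self time and keeps the top five for each device. It is not part of the tool: the `Region` record, its field names, and the sample timings are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Region:
    # Hypothetical record for illustration; not a format produced by the tool.
    name: str
    device: str         # "GPU" or "CPU"
    self_time_s: float  # time spent in the region itself, in seconds


def top_hotspots(regions, n=5):
    """Return the n regions with the largest self time for each device."""
    by_device = defaultdict(list)
    for region in regions:
        by_device[region.device].append(region)
    return {
        device: sorted(items, key=lambda r: r.self_time_s, reverse=True)[:n]
        for device, items in by_device.items()
    }


# Made-up sample data
sample = [
    Region("kernel_a", "GPU", 1.5),
    Region("kernel_b", "GPU", 4.0),
    Region("kernel_c", "GPU", 0.25),
    Region("loop_x", "CPU", 2.0),
    Region("loop_y", "CPU", 3.5),
]

hotspots = top_hotspots(sample)
gpu_names = [r.name for r in hotspots["GPU"]]
cpu_names = [r.name for r in hotspots["CPU"]]
```

The regions at the head of each list are the natural starting points when looking for performance issues.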
After you review the Summary section, continue to the GPU Roofline Insights Regions tab to examine your application performance in more detail.
For more information about the reported metrics, see Accelerator Metrics.
How It Works
The GPU Roofline Insights perspective includes the following steps:
- Collects OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
- Measures the hardware limitations and collects floating-point and integer operations data using the Characterization analysis with GPU profiling.
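The two steps above supply the inputs of a standard roofline calculation: kernel time and memory traffic from the first step, operation counts and hardware ceilings from the second. The sketch below applies the generic roofline formula (attainable performance is the smaller of the compute peak and arithmetic intensity times memory bandwidth). It is only an illustration, not the tool's implementation: the function names, the peak values, and the kernel numbers are all assumed.

```python
def arithmetic_intensity(operations, bytes_moved):
    """Operations performed per byte of memory traffic."""
    return operations / bytes_moved


def attainable_gops(intensity, peak_gops, peak_bandwidth_gbs):
    """Roofline ceiling: limited by either compute peak or memory bandwidth."""
    return min(peak_gops, intensity * peak_bandwidth_gbs)


def limiting_factor(intensity, peak_gops, peak_bandwidth_gbs):
    """Classify a kernel relative to the ridge point of the roofline."""
    ridge_point = peak_gops / peak_bandwidth_gbs
    return "memory-bound" if intensity < ridge_point else "compute-bound"


# Hypothetical hardware ceilings (assumed, not measured)
PEAK_GOPS = 1000.0          # giga-operations per second
PEAK_BANDWIDTH_GBS = 100.0  # gigabytes per second

# Hypothetical kernel measurements (assumed)
operations = 2e9   # operations executed
bytes_moved = 1e9  # bytes transferred to and from memory
time_s = 0.25      # kernel execution time in seconds

intensity = arithmetic_intensity(operations, bytes_moved)
roof = attainable_gops(intensity, PEAK_GOPS, PEAK_BANDWIDTH_GBS)
achieved = operations / time_s / 1e9  # achieved giga-operations per second
bound = limiting_factor(intensity, PEAK_GOPS, PEAK_BANDWIDTH_GBS)
headroom = roof / achieved            # how far the kernel sits below its roof
```

With these assumed numbers the kernel falls to the left of the ridge point, so memory bandwidth rather than compute throughput is its main limiting factor.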
Perspective Views
- Analysis Workflow pane - Review the controls available to configure the perspective workflow for your application.
- GPU Roofline Summary report - Review a results summary that includes the most important information about your application performance.
- GPU Roofline Regions report - Review the chart controls available to help you focus on the performance data most important to you.
- Logs - Review the log messages reported during the perspective execution.