CPU / Memory Roofline Insights Perspective
- What is the maximum achievable performance with your current hardware resources?
- Does your application work optimally on current hardware resources?
- If not, what are the best candidates for optimization?
- Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
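The last question can be illustrated with the basic roofline relation: attainable performance is the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. The sketch below is only an illustration under stated assumptions. The function names and the machine figures (100 GFLOPS peak, 50 GB/s bandwidth) are hypothetical and are not values or APIs from the tool.

```python
def attainable_gflops(ai, peak_gflops, bandwidth_gbs):
    """Roofline ceiling for a loop with arithmetic intensity `ai` (FLOP/byte)."""
    # The memory roof rises linearly with intensity; the compute roof is flat.
    return min(peak_gflops, ai * bandwidth_gbs)


def limiting_resource(ai, peak_gflops, bandwidth_gbs):
    """Say which roof caps a loop with arithmetic intensity `ai`."""
    # The ridge point is the intensity at which the two roofs meet.
    ridge = peak_gflops / bandwidth_gbs
    return "memory bandwidth" if ai < ridge else "compute capacity"


# Hypothetical machine: 100 GFLOPS compute peak, 50 GB/s memory bandwidth.
PEAK_GFLOPS = 100.0
BANDWIDTH_GBS = 50.0

for ai in (0.25, 8.0):
    print(ai,
          attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS),
          limiting_resource(ai, PEAK_GFLOPS, BANDWIDTH_GBS))
```

With these assumed figures the ridge point is 2 FLOP/byte, so a loop at 0.25 FLOP/byte is capped by memory bandwidth and a loop at 8 FLOP/byte by compute capacity.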
How It Works
- Collect loop/function timings using the Survey analysis.
- Collect floating-point and/or integer operations data and memory traffic data, and measure the hardware limitations of your machine, using the FLOP analysis in the Characterization step. This collection can take three to four times longer than the Survey analysis.
CPU Roofline Report
- Arithmetic intensity (x axis) - measured in floating-point operations (FLOPs) and/or integer operations (INTOPs) per byte transferred between the CPU/VPU and memory, based on the loop/function algorithm
- Performance (y axis) - measured in billions of floating-point operations per second (GFLOPS) and/or billions of integer operations per second (GINTOPS)
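As a rough illustration of how the two axis definitions place a loop on the chart, the sketch below derives both coordinates from an operation count, a byte count, and an elapsed time. The function name `roofline_point` and the sample figures are hypothetical. They are not output from the tool, which collects these quantities itself.

```python
def roofline_point(operations, bytes_transferred, seconds):
    """Return (arithmetic intensity, performance) for one loop or function.

    operations        -- floating-point and/or integer operations executed
    bytes_transferred -- bytes moved between the CPU/VPU and memory
    seconds           -- elapsed time of the loop or function
    """
    intensity = operations / bytes_transferred   # x axis: operations per byte
    giga_ops_per_sec = operations / seconds / 1e9  # y axis: GFLOPS or GINTOPS
    return intensity, giga_ops_per_sec


# Hypothetical loop: 2e9 FLOPs, 8e9 bytes of traffic, 0.5 s elapsed.
x, y = roofline_point(2e9, 8e9, 0.5)
print(f"arithmetic intensity = {x} FLOP/byte, performance = {y} GFLOPS")
```

For this assumed loop the point lands at 0.25 FLOP/byte and 4 GFLOPS.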