GPU Roofline Accuracy Presets

For each perspective, Intel® Advisor has several levels of collection accuracy. Each accuracy level is a set of analyses and properties that control what data is collected and the level of collection detail. The higher the accuracy level you choose, the higher the runtime overhead.
The following accuracy levels are available:
| Comparison / Accuracy Level | Low | Medium |
|---|---|---|
| Overhead | 5 - 10x | 15 - 50x |
| Goal | Analyze kernels in your application running on GPU | Analyze kernels running on GPU and loops/functions running on CPU in more detail |
| Analyses | Survey with GPU profiling + Characterization (FLOP) | Survey with GPU profiling + Characterization (Trip Counts and FLOP with call stacks for CPU and CPU cache simulation) |
| Result for kernels on GPU | Memory-level GPU Roofline (for CARM, L3, SLM, GTI) | Memory-level GPU Roofline (for CARM, L3, SLM, GTI) |
| Result for loops/functions on CPU | Cache-aware CPU Roofline for L1 cache | Memory-level Roofline with call stacks (for L1, L2, L3, DRAM) |
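
To get a feel for what the overhead ranges above mean in practice, the following minimal sketch turns them into a rough collection-time estimate. It assumes that "5 - 10x" and "15 - 50x" are multipliers on the application's uninstrumented run time. The keys `low` and `medium` and the function `estimate_collection_time` are hypothetical names used only for illustration and are not part of Intel® Advisor.

```python
# Overhead multipliers (min, max) taken from the accuracy-level comparison.
# Assumption: each range is a multiplier on the uninstrumented run time.
# The keys "low" and "medium" are illustrative labels for the two levels.
OVERHEAD = {
    "low": (5, 10),
    "medium": (15, 50),
}


def estimate_collection_time(baseline_seconds, level):
    """Return the (min, max) estimated collection time in seconds."""
    if baseline_seconds < 0:
        raise ValueError("baseline_seconds must be non-negative")
    lowest, highest = OVERHEAD[level.lower()]
    return baseline_seconds * lowest, baseline_seconds * highest


if __name__ == "__main__":
    # Example: an application that normally runs for 60 seconds.
    for level in OVERHEAD:
        shortest, longest = estimate_collection_time(60, level)
        print(f"{level}: {shortest / 60:.0f}-{longest / 60:.0f} min")
```

Under these assumptions, a 60-second application would take roughly 5 to 10 minutes to profile at the lower level and 15 to 50 minutes at the higher one. Actual overhead depends on the workload and hardware.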
You can also choose a custom accuracy level and set a custom perspective flow for your application. For more information, see Customize GPU Roofline Insights.
Several techniques are available to minimize data collection, result size, and execution overhead. See Minimize Analysis Overhead.
