GPU Analysis Report

Intel® VTune™ Amplifier provides several ways to view GPU Analysis data collected from the command line.

  • The primary way to view cpugpu-concurrency and gpu-hotspots analysis data is the standalone GUI.
  • For OpenCL™ applications running on Windows targets, you can use the gpu-computing-tasks command line report to view computing tasks or their instances, annotated with total and average time, instance count, and average values of GPU hardware metrics (if applicable).

Examples

Example 1: Report per OpenCL Kernel

This example shows how to view the collected data for each OpenCL kernel submitted and executed on Intel® HD Graphics and Intel® Iris® Graphics, using the gpu-computing-tasks report type:

$ amplxe-cl -report gpu-computing-tasks -result-dir r010ah
Computing Task (GPU)     Global Size  Local Size  SIMD Width  Total Time  Average Time  Instance Count
-----------------------  -----------  ----------  ----------  ----------  ------------  --------------
BitonicSort              4194304      [Unknown]   16              3.435s        0.012s             276
BitonicSort              8388608      [Unknown]   16              0.330s        0.014s              24
clEnqueueMapBuffer       [Unknown]    [Unknown]   [Unknown]       0.000s        0.000s               1
clEnqueueUnmapMemObject  [Unknown]    [Unknown]   [Unknown]       0.000s        0.000s               1
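
Because the same kernel can appear on several rows (one per global size), it may help to total the rows per task name. The following is a minimal sketch, not part of the product: the helper name sum_by_task is hypothetical, and it assumes the default column layout shown above and task names that contain no spaces.

```shell
# Hypothetical helper (not a VTune Amplifier feature): reads a
# gpu-computing-tasks text report on stdin and prints, per task name,
# the summed Total Time (seconds) and the summed Instance Count.
# Assumes the default 7-column layout and task names without spaces.
sum_by_task() {
  awk '
    /^---/ { body = 1; next }      # data rows start after the dashed rule
    body && NF >= 7 {
      total[$1] += $5 + 0          # "3.435s" converts to 3.435
      count[$1] += $7
    }
    END {
      for (t in total) printf "%s %.3f %d\n", t, total[t], count[t]
    }' | sort
}
```

A possible use is to pipe the report straight into the helper: amplxe-cl -report gpu-computing-tasks -result-dir r010ah | sum_by_task. For the table above, the two BitonicSort rows collapse into one line with 3.765 seconds and 300 instances.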

Example 2: Report Grouped by OpenCL Kernel Instance

This example groups the collected data by OpenCL kernel instance, so each row describes a single instance of a kernel:

$ amplxe-cl -report gpu-computing-tasks -result-dir r010ah -group-by=computing-instance

Computing Task (GPU)     Instance  Global Size  Local Size  SIMD Width  Total Time  Average Time  Instance Count
-----------------------  --------  -----------  ----------  ----------  ----------  ------------  --------------
BitonicSort              1         8388608      [Unknown]   16              0.045s        0.045s               1
BitonicSort              3         8388608      [Unknown]   16              0.017s        0.017s               1
BitonicSort              2         4194304      [Unknown]   16              0.016s        0.016s               1
BitonicSort              54        4194304      [Unknown]   16              0.015s        0.015s               1
BitonicSort              152       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              104       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              103       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              296       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              5         4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              248       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              201       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              202       4194304      [Unknown]   16              0.014s        0.014s               1
BitonicSort              249       4194304      [Unknown]   16              0.014s        0.014s               1
...
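
A per-instance report can run to hundreds of rows, so a small filter can help pick out the slow instances. The following is a minimal sketch, not part of the product: the helper name slow_instances and its threshold argument are hypothetical, and it assumes the 8-column layout shown above and task names that contain no spaces.

```shell
# Hypothetical helper (not a VTune Amplifier feature): reads a
# gpu-computing-tasks report grouped by computing-instance on stdin and
# prints the instances whose Total Time exceeds a threshold in seconds.
# Assumes the 8-column layout (Instance is column 2, Total Time column 6).
slow_instances() {
  awk -v limit="$1" '
    /^---/ { body = 1; next }      # data rows start after the dashed rule
    body && NF >= 8 && ($6 + 0) > (limit + 0) {
      printf "%s #%s %.3fs\n", $1, $2, $6 + 0
    }'
}
```

A possible use, with 0.015 seconds as an arbitrary threshold: amplxe-cl -report gpu-computing-tasks -result-dir r010ah -group-by=computing-instance | slow_instances 0.015. Rows that do not have all eight columns, such as a trailing "..." line, are skipped.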
