Configuring GPU Analysis

For applications that use Processor Graphics, configure the Intel® VTune™ Amplifier to monitor, analyze, and correlate application performance on both the CPU and GPU.

To enable the GPU analysis:

  1. Click the New Analysis button on the VTune Amplifier toolbar (standalone GUI or Visual Studio IDE).

    The Analysis Type configuration window opens.

  2. From the analysis tree on the left, choose the required analysis type.

    The right pane is updated with the configuration options for the selected analysis type. For GPU analysis, the VTune Amplifier provides three options: Analyze DirectX pipeline events, Analyze Processor Graphics hardware events, and Trace OpenCL kernels on Processor Graphics.

  3. To analyze GPU task scheduling and identify whether your application is CPU or GPU bound, select the Analyze DirectX pipeline events option.
  4. For Intel HD Graphics: To monitor Render and GPGPU engine usage, identify which parts of the engine are loaded, and correlate GPU and CPU data, select a predefined event set from the Analyze Processor Graphics hardware events drop-down menu.

    VTune Amplifier provides two presets of hardware metrics. Both presets collect data about execution unit (EU) activity: EU Array Active, EU Array Stalled, EU Array Idle, Computing Threads Started, and Core Frequency.

    • The Overview event set also includes metrics that track general GPU memory accesses, such as Memory Read/Write Bandwidth, GPU L3 Misses, Sampler Busy, Sampler Is Bottleneck, and GPU Memory Texture Read Bandwidth. These metrics can be useful for both graphics and compute-intensive applications.

    • The Global/local accesses event set also includes metrics that distinguish accesses to different types of data on a GPU: Untyped Memory Read/Write Bandwidth, Typed Memory Read/Write Transactions, and SLM Read/Write Bandwidth. These metrics are useful for compute-intensive workloads on the GPU.

  5. For OpenCL™ kernels running on Intel HD Graphics: To see the execution time of OpenCL kernels, monitor the performance of each kernel per GPU metric, and identify hotspot kernels, select the Trace OpenCL kernels on Processor Graphics option.
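
The three GUI options above can also be thought of as switches on a collection command. The following Python sketch shows one way to assemble such a command for scripted runs. It is an illustration under stated assumptions, not a confirmed interface: the amplxe-cl executable name, the advanced-hotspots analysis type, and the three knob names and their values are assumptions that vary between VTune Amplifier releases, and build_gpu_collect_command is a hypothetical helper. Check the command-line help of your installed version before relying on any of them.

```python
from typing import List, Optional

# ASSUMPTION: the knob names below are illustrative and differ between
# VTune Amplifier releases; verify them against your version's CLI help.
GPU_USAGE_KNOB = "enable-gpu-usage"        # Analyze DirectX pipeline events
GPU_COUNTERS_KNOB = "gpu-counters-mode"    # Analyze Processor Graphics hardware events
OPENCL_TRACE_KNOB = "enable-gpu-runtimes"  # Trace OpenCL kernels on Processor Graphics


def build_gpu_collect_command(
    app: List[str],
    analysis_type: str = "advanced-hotspots",  # ASSUMPTION: analysis type name
    pipeline_events: bool = False,
    counters_preset: Optional[str] = None,     # e.g. "overview" (assumed value)
    trace_opencl: bool = False,
) -> List[str]:
    """Return an argument list mirroring the GUI options selected in steps 3-5."""
    cmd = ["amplxe-cl", "-collect", analysis_type]
    if pipeline_events:
        cmd += ["-knob", f"{GPU_USAGE_KNOB}=true"]
    if counters_preset is not None:
        cmd += ["-knob", f"{GPU_COUNTERS_KNOB}={counters_preset}"]
    if trace_opencl:
        cmd += ["-knob", f"{OPENCL_TRACE_KNOB}=true"]
    # Everything after "--" is the target application and its arguments.
    return cmd + ["--"] + app


if __name__ == "__main__":
    # "myapp.exe" is a placeholder target, not a real application.
    print(" ".join(build_gpu_collect_command(
        ["myapp.exe"], pipeline_events=True, counters_preset="overview")))
```

Returning an argument list rather than a single string keeps the sketch safe to pass to subprocess.run without shell quoting concerns.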

When collection and post-processing are complete and the result is open, click the Graphics tab to see details of GPU activity correlated with CPU processes and threads. For a description of a GPU metric, hover over the column name in the grid, or right-click and select the What's This Column? context menu option.
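
When reading the EU Array Active, EU Array Stalled, and EU Array Idle columns, it can help to see which state dominates for a given kernel or time range. The sketch below is a hypothetical helper for values copied by hand from the grid. The function name and the sample numbers are made up for illustration, and it assumes the three values are shares of the same total.

```python
from typing import Dict, Tuple


def dominant_eu_state(active: float, stalled: float, idle: float) -> Tuple[str, float]:
    """Return the EU state with the largest share and its fraction of the total.

    ASSUMPTION: the three inputs are non-negative shares of the same total
    (for example, percentages read from the Graphics tab).
    """
    shares: Dict[str, float] = {"active": active, "stalled": stalled, "idle": idle}
    if any(value < 0 for value in shares.values()):
        raise ValueError("shares must be non-negative")
    total = sum(shares.values())
    if total == 0:
        raise ValueError("at least one share must be positive")
    # Pick the state with the largest share; ties resolve in dict order.
    state = max(shares, key=lambda name: shares[name])
    return state, shares[state] / total


if __name__ == "__main__":
    # Made-up sample values, not measured data.
    print(dominant_eu_state(25.0, 50.0, 25.0))
```

A large stalled or idle share is only a starting point for investigation; use the metric descriptions in the grid to interpret what it means for your workload.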


Similar to running graphics applications with hardware acceleration, you cannot run GPU data collection via a Remote Desktop connection. To run the GPU data collection, run the VTune Amplifier from the target computer's console or access the computer via VNC. To monitor general GPU busyness over time, run the VTune Amplifier as an Administrator.
