Example: Profile a DPC++ Matrix Application on Windows*

Use Intel® VTune™ Profiler with the sample matrix_multiply DPC++ (Data Parallel C++) application to get familiar with the product and with the statistics it collects for GPU-bound applications.


For information on installing Intel VTune Profiler in the Microsoft* Visual Studio environment, see the VTune Profiler User Guide.

Build the Matrix App

Download the code sample package for Intel oneAPI toolkits. The package contains the matrix_multiply sample, which you can use to build and profile a DPC++ application.

  1. Open Microsoft* Visual Studio.
  2. Click File > Open > Project/Solution. Find the matrix_multiply folder and select matrix_multiply.sln.
  3. Build the project (Project > Build).
  4. Run the program (Debug > Start Without Debugging).
  5. To switch between the DPC++ and threaded versions of the sample, set a preprocessor definition and rebuild the project:

    1. Go to Project Properties > DPC++ > Preprocessor > Preprocessor Definitions.
    2. Define DPCPP or USE_THR.
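
A preprocessor definition selects a code path at compile time, so only the chosen version ends up in the binary. The following is a minimal sketch of that mechanism, not the code from the matrix_multiply sample: only the macro names DPCPP and USE_THR come from the steps above, and the names Matrix, multiply_rows, selected_macro, and multiply are hypothetical. The DPCPP branch is a placeholder that runs the host loop, so the sketch builds with any C++11 compiler.

```cpp
// Hypothetical sketch of compile-time variant selection.
// Only the macro names DPCPP and USE_THR come from the tutorial;
// every function and type name below is an illustrative assumption.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#if defined(USE_THR) && !defined(DPCPP)
#include <functional>
#include <thread>
#endif

using Matrix = std::vector<float>;  // row-major n x n storage

// Compute rows [row_begin, row_end) of c = a * b.
void multiply_rows(const Matrix& a, const Matrix& b, Matrix& c,
                   std::size_t n, std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

// Report which of the two macros the build defined (DPCPP wins if both are set).
std::string selected_macro() {
#if defined(DPCPP)
    return "DPCPP";
#elif defined(USE_THR)
    return "USE_THR";
#else
    return "none";
#endif
}

// Multiply two n x n matrices using the code path chosen at compile time.
Matrix multiply(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
#if defined(DPCPP)
    // Placeholder: a DPC++ build would offload this loop to a device.
    // The offload code is omitted here, so this branch runs on the host.
    multiply_rows(a, b, c, n, 0, n);
#elif defined(USE_THR)
    // Threaded path: split the rows into contiguous chunks, one per thread.
    const unsigned hw = std::thread::hardware_concurrency();  // may be 0
    const std::size_t workers =
        std::max<std::size_t>(1, std::min<std::size_t>(hw, n));
    const std::size_t chunk = (n + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(n, begin + chunk);
        pool.emplace_back(multiply_rows, std::cref(a), std::cref(b),
                          std::ref(c), n, begin, end);
    }
    for (auto& t : pool) {
        t.join();
    }
#else
    // Default path when neither macro is defined: single-threaded host loop.
    multiply_rows(a, b, c, n, 0, n);
#endif
    return c;
}
```

Adding USE_THR to the project's preprocessor definitions, as in the steps above, would compile the threaded branch of this sketch; with neither macro defined, selected_macro() reports "none" and multiply runs the single-threaded loop.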

Run GPU Analysis

Run a GPU analysis on the Matrix sample.

  1. From the Visual Studio toolbar, click the Configure Analysis button.

    The Configure Analysis window opens. By default, it inherits your Visual Studio project settings and specifies matrix_multiply.exe as the application to profile.

  2. In the Configure Analysis window, click the Browse button in the HOW pane.
  3. Under Platform Analysis, select the GPU Compute/Media Hotspots analysis type.

  4. Click the Start button to launch the analysis with the predefined options.

VTune Profiler collects data and displays the analysis results in the GPU Compute/Media Hotspots viewpoint. In the Summary window, review the statistics on CPU and GPU resource usage to determine whether your application is GPU-bound. Switch to the Graphics window to see basic CPU and GPU metrics that show code execution over time.
