Profiling a DPC++ Application
- Tools: Intel® VTune™ Profiler (Beta) - GPU Compute/Media Hotspots Analysis
- All the Cookbook recipes are scalable and can be applied to Intel VTune Amplifier 2018 and higher. Slight version-specific configuration changes are possible.
- Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler starting with its version for Intel® oneAPI Base Toolkit (Beta). You can still use a standalone version of the VTune Profiler, or its versions integrated into Intel Parallel Studio XE or Intel System Studio.
- Intel Processor Graphics Gen9
- Intel microarchitecture code name Kaby Lake or Coffee Lake
- Operating system: Linux*. Run GPU target profiling on Linux kernel 4.14 or newer.
- Graphical User Interface:
  - GTK+ (2.10 or higher; ideally, use 2.18 or higher)
  - Pango (1.14 or higher)
  - X.Org (1.0 or higher; ideally, use 1.7 or higher)
Build and Compile a DPC++ Application
- Go to the sample directory:
    cd <sample_dir>/VtuneProfiler/matrix_multiply
- The multiply.cpp file in the src directory contains several DPC++ versions of matrix multiplication. Select a version by editing the corresponding #define MULTIPLY line in multiply.h.
- Compile your sample DPC++ application:
    cmake .
    make
  This generates a matrix.dpcpp executable. To delete the program, type:
    make clean
  This removes the executable and object files that were created by the make command.
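The build steps above can be wrapped in a small script. The following is a minimal sketch, not part of the sample itself: the function names `run`, `build_matrix_sample`, and `clean_matrix_sample`, the `DRY_RUN` variable, and the sample-directory argument are all hypothetical helpers introduced for illustration. It assumes the DPC++ compiler environment is already set up in the current shell.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are assumptions, not part of the sample.

# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the matrix_multiply sample under the given <sample_dir>.
# The body runs in a subshell so the caller's working directory is unchanged.
build_matrix_sample() (
    run cd "$1/VtuneProfiler/matrix_multiply" &&
    run cmake . &&
    run make &&
    # The build is expected to produce the matrix.dpcpp executable.
    run test -x matrix.dpcpp
)

# Remove the executable and object files created by make.
clean_matrix_sample() (
    run cd "$1/VtuneProfiler/matrix_multiply" &&
    run make clean
)
```

With `DRY_RUN=1` set, calling `build_matrix_sample <sample_dir>` only prints the commands it would run, which is a convenient way to check the path before building. Each step is chained with `&&`, so a failed `cmake` or `make` stops the sequence and returns a non-zero status.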
Run GPU Analysis on a DPC++ Application
- Launch the VTune Profiler (Beta) and click New Project from the Welcome page. The Create a Project dialog box opens.
- Specify a project name and a location for your project and click Create Project. The Configure Analysis window opens.
- Make sure that Local Host is selected in the WHERE pane.
- In the WHAT pane, make sure the Launch Application target is selected and specify the matrix_multiply binary (the matrix.dpcpp executable generated by the build) as the Application to profile.
- In the HOW pane, select GPU Compute/Media Hotspots from the Platform Analysis group. This is the least intrusive analysis for applications running on platforms with Intel Graphics as well as on other third-party GPUs supported by Intel® VTune™ Profiler.
- For the initial GPU analysis, make sure the following default options are enabled:
  - Characterization mode with the Overview metric preset selected;
  - Trace GPU Programming APIs check box selected.
- Click the Start button at the bottom to launch the analysis. To run the same configuration from the command line, enter:
    vtune -collect gpu-hotspots -- ./matrix.dpcpp
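For repeated runs, the command-line invocation above can be scripted. The sketch below is an illustration under stated assumptions: the helper names `run`, `collect_gpu_hotspots`, and `summarize_result`, the `DRY_RUN` variable, and the result directory name are hypothetical. The `-result-dir` option and the `-report summary` action are not part of the original recipe; they are standard `vtune` command-line options, but check `vtune -help` for your version before relying on them.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN, and the result directory are assumptions.

# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Collect GPU Compute/Media Hotspots data into a named result directory.
# Usage: collect_gpu_hotspots <result_dir> <application> [args...]
collect_gpu_hotspots() {
    result_dir="$1"
    shift
    run vtune -collect gpu-hotspots -result-dir "$result_dir" -- "$@"
}

# Print a text summary of a previously collected result.
summarize_result() {
    run vtune -report summary -result-dir "$1"
}
```

A typical sequence would be `collect_gpu_hotspots r_gpu_hotspots ./matrix.dpcpp` followed by `summarize_result r_gpu_hotspots`. Naming the result directory explicitly makes it easier to compare runs of the different matrix multiplication versions.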
Analyze Collected Data
| GPU Bound Applications | CPU Bound Applications |
| --- | --- |
| The GPU is busy for a majority of the profiling time. | The CPU is busy for a majority of the profiling time. |
| There are small idle gaps between busy intervals. | There are large idle gaps between busy intervals. |
The GPU software queue is rarely reduced to zero.