This article explains how to use VTune(TM) Amplifier XE 2013 to collect performance data on an OpenCL* application running on an Intel(R) Xeon Phi(TM) coprocessor.
Xeon Phi coprocessor driver setup:
- cd /opt/intel/vtune_amplifier_xe_2013/bin64/k1om
- This step is necessary for the JIT collection on the Xeon Phi coprocessor. The previous steps enable the sampling collection driver.
- service mpss restart
- You need to restart the system for the new drivers to be loaded.
For your OpenCL* kernel code to run on a Xeon Phi coprocessor, you need to request an OpenCL* device of the accelerator type:
err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device_id, NULL);
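The call above stores its result in err, but the article does not show how that value is checked. As a minimal sketch, assuming the status codes that the OpenCL specification lists for clGetDeviceIDs, a small helper can turn the result into a readable message. CL_DEVICE_NOT_FOUND is the value to expect when no accelerator-type device is visible to the OpenCL runtime. The helper names device_query_status_name and report_device_query are hypothetical, and the status constants are repeated locally only so that the sketch compiles without the OpenCL headers.

```c
#include <stdio.h>

/* Assumption: these values mirror the status codes in <CL/cl.h>.
 * They are repeated here only so the sketch builds without the
 * OpenCL headers; a real application would include <CL/cl.h>. */
#ifndef CL_SUCCESS
#define CL_SUCCESS               0
#define CL_DEVICE_NOT_FOUND     -1
#define CL_OUT_OF_RESOURCES     -5
#define CL_OUT_OF_HOST_MEMORY   -6
#define CL_INVALID_VALUE        -30
#define CL_INVALID_DEVICE_TYPE  -31
#define CL_INVALID_PLATFORM     -32
#endif

/* Hypothetical helper (not part of OpenCL): map a status code that
 * clGetDeviceIDs may return to its symbolic name. */
static const char *device_query_status_name(int status)
{
    switch (status) {
    case CL_SUCCESS:             return "CL_SUCCESS";
    case CL_DEVICE_NOT_FOUND:    return "CL_DEVICE_NOT_FOUND";
    case CL_OUT_OF_RESOURCES:    return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY:  return "CL_OUT_OF_HOST_MEMORY";
    case CL_INVALID_VALUE:       return "CL_INVALID_VALUE";
    case CL_INVALID_DEVICE_TYPE: return "CL_INVALID_DEVICE_TYPE";
    case CL_INVALID_PLATFORM:    return "CL_INVALID_PLATFORM";
    default:                     return "unrecognized status";
    }
}

/* Hypothetical helper: print a diagnostic for a failed device query.
 * Returns 0 on success and 1 on any failure. */
static int report_device_query(int status)
{
    if (status == CL_SUCCESS) {
        return 0;
    }
    fprintf(stderr, "clGetDeviceIDs failed: %s (%d)\n",
            device_query_status_name(status), status);
    return 1;
}
```

In an application, report_device_query(err) would be called immediately after clGetDeviceIDs, and a nonzero return would be a reason to stop before creating a context.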
You should then be able to compile your application. Using the Intel compiler:
icc -g -L/opt/intel/opencl/lib64 -lintelocl -lcl_logger
Running an analysis in the VTune Amplifier XE GUI:
- source /opt/intel/vtune_amplifier_xe_2013/amplxe-vars.sh
- amplxe-gui &
- Create a project File->New->
- This will bring up the “Project Properties” dialog.
- Specify your OpenCL* binary as the application to launch.
- You should also specify some additional search directories.
- “Search Directories” tab
- Search directories for: “All”
- Run a Lightweight-hotspots collection
- Click on the “New Analysis” button
- This will bring up the New Analysis dialog
- Scroll down to Knights Corner Platform -> Lightweight hotspots
- Click Start
- Your application will launch and performance data will be collected on the Xeon Phi coprocessor.