Intel® Xeon Phi™ Coprocessor (Code Name: Knights Corner) Analysis Workflow

Note

This type of analysis is supported for Intel® Parallel Studio XE only.

The following figure shows the basic steps required to analyze an application running on Intel® Xeon Phi™ coprocessors based on the Intel® Many Integrated Core Architecture (Intel® MIC Architecture), or to perform a system-wide analysis. You can run one of the predefined analysis types (Hotspots, Memory Bandwidth, or General Exploration) or create a custom analysis type. To display more information about a workflow step:

  • Hover the mouse pointer over the step to display a brief explanation.
  • Click the step to display the associated topic.

Prerequisites: Build the target on the host with full optimizations, which is recommended for performance analysis.

1. Install the sampling driver

Install the sampling server and driver on each Intel Xeon Phi coprocessor card that you want to sample.

3. Specify and configure your analysis target from the host system

  • For native application analysis, copy the binary to the Intel Xeon Phi coprocessor. For offload applications, no copying is required. To communicate with the Intel Xeon Phi coprocessor cards, you may use any of the following mechanisms:

    • Mount an NFS share. See the NFS Mounting a Host Export topic in the Intel Manycore Platform Software Stack (MPSS) help for details.

    • Use existing SSH tools.

  • In the Project Properties: Target, specify an analysis target, which can be the system or an application on an Intel Xeon Phi coprocessor card.

  • Symbol resolution for Intel Xeon Phi coprocessor modules is done on the host during collection post-processing. For proper symbol resolution, specify search paths for Intel Xeon Phi coprocessor modules in the Project Properties: Binary/Symbol Search and Source Search tabs. You can also specify search paths after collection; in that case, re-resolve the result so that symbol information is read from the binaries found in the newly specified paths.
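As an illustration of the copy step for native applications described in the first bullet above, the following Python sketch builds an scp command that transfers a host-built binary to a coprocessor card over SSH. The host name mic0, the destination directory /tmp, and the binary name my_native_app are assumptions chosen for the example, not values taken from this topic; substitute the names used on your system.

```python
import subprocess
from typing import List


def build_scp_command(local_binary: str, card_host: str, remote_dir: str) -> List[str]:
    """Return the scp argument list that copies a host-built binary to the card."""
    return ["scp", local_binary, f"{card_host}:{remote_dir}/"]


def copy_native_binary(local_binary: str,
                       card_host: str = "mic0",   # assumed card host name
                       remote_dir: str = "/tmp") -> None:  # assumed target directory
    """Run scp; raises subprocess.CalledProcessError if the copy fails."""
    subprocess.run(build_scp_command(local_binary, card_host, remote_dir), check=True)


# Show the command that would be run for a hypothetical binary.
print(" ".join(build_scp_command("./my_native_app", "mic0", "/tmp")))
```

Calling copy_native_binary("./my_native_app") would perform the transfer. For offload applications this step is unnecessary, as noted above.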

4. Configure and run an analysis type

  • To enable performance analysis for offload applications, set the environment variable AMPLXE_COI_DEBUG_SUPPORT=TRUE. By default, it is set to FALSE to reduce the overhead of running offload applications.

  • From the performance analysis tree in the Analysis Type window, choose Knights Corner Platform Analysis and configure the required analysis type. Click Start to run the analysis.
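The environment variable setting described in the first bullet above can be sketched in Python as follows. This is a minimal sketch that assumes the variable has to be visible in the host environment from which the offload application (or the tool that launches it) is started. The helper names and the application path are hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def offload_analysis_env(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with offload analysis support enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable name and value are taken from the step above.
    env["AMPLXE_COI_DEBUG_SUPPORT"] = "TRUE"
    return env


def run_offload_app(command: List[str]) -> int:
    """Launch a (hypothetical) offload application with the variable set."""
    return subprocess.run(command, env=offload_analysis_env()).returncode


# Show the value that a launched process would see.
print(offload_analysis_env({})["AMPLXE_COI_DEBUG_SUPPORT"])
```

Because the helper works on a copy of the environment, other processes started from the same session keep the default (FALSE) behavior and avoid the extra offload overhead.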

5. Open and interpret analysis results

Intel® VTune™ Amplifier generates a data collection result and, by default, opens it in the Hotspots viewpoint. Switch between available viewpoints to identify code regions that took most of the CPU time and experienced potentially significant architectural bottlenecks.

For more complete information about compiler optimizations, see the Optimization Notice.