After building the sample application and collecting baseline performance data for it, run it again under Intel® VTune™ Amplifier to find out which parts of the code are used most. The Advanced Hotspots analysis collects event and instruction pointer (IP) information, which can reveal a basic set of hardware issues caused by the application code that may be affecting its performance.
Prerequisites: You created a project and specified your sample application as an Intel Xeon Phi coprocessor (native) target in the Analysis Target tab.
VTune Amplifier automatically detects your target system configuration and displays analysis types applicable to the Intel® Xeon Phi™ coprocessor.
The Advanced Hotspots predefined configuration opens on the right.
The Intel Xeon Phi coprocessor (host launch) target type does not support call stack collection, so only the default Hotspots mode is available for the Advanced Hotspots analysis.
VTune Amplifier starts the bat script that runs the matrix.mic application on the Intel Xeon Phi coprocessor card. The application calculates a large matrix multiplication and then exits. Collection ends when the application exits or after a predefined interval, depending on how the collection run was configured. VTune Amplifier then enters its finalization process, in which data are coalesced, symbols are reconnected to their addresses, and certain data are cached to speed up the display of results.
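To make the finalization idea more concrete, the following minimal Python sketch shows one way sampled instruction pointers could be attributed to functions. It is an illustration only, not how VTune Amplifier is implemented: the symbol table, the function names (`main`, `multiply`, `init_arr`), the addresses, and the sample values are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start_address, size, function_name),
# sorted by start address. Real tools read this from debug/symbol files.
SYMBOLS = [
    (0x1000, 0x200, "main"),
    (0x1200, 0x800, "multiply"),
    (0x1A00, 0x100, "init_arr"),
]
STARTS = [start for start, _, _ in SYMBOLS]


def resolve(ip):
    """Return the function whose address range contains ip, or '[unknown]'."""
    idx = bisect_right(STARTS, ip) - 1  # last symbol starting at or before ip
    if idx >= 0:
        start, size, name = SYMBOLS[idx]
        if ip < start + size:
            return name
    return "[unknown]"


def hotspots(samples):
    """Count samples per function, most frequently sampled first."""
    return Counter(resolve(ip) for ip in samples).most_common()


# Hypothetical instruction pointer samples collected during a run.
samples = [0x1204, 0x1500, 0x19FF, 0x1010, 0x1A10, 0x2000, 0x1300]

for name, count in hotspots(samples):
    print(f"{name:10s} {count}")
```

In this sketch the function with the most samples is reported first, which is the basic idea behind a hotspot list: code that is executing more often is more likely to be caught by a sample.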
To make sure the performance of the application is repeatable, go through the entire tuning process on the same system with a minimal amount of other software executing.
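One informal way to check repeatability before tuning is to time several runs of the same command and look at the spread. The Python sketch below assumes the workload can be started from a command line; the command shown is a placeholder, not the tutorial's launch script, and the helper names `time_runs` and `variation` are made up for this example.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, runs=5):
    """Run cmd `runs` times and return each run's wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def variation(timings):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(timings) / statistics.mean(timings)


if __name__ == "__main__":
    # Placeholder workload; substitute the command that launches the sample.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(200000))"]
    timings = time_runs(cmd)
    print(f"mean {statistics.mean(timings):.3f} s, "
          f"variation {variation(timings):.1%}")
```

A small variation across runs suggests the system is quiet enough for before-and-after comparisons to be meaningful; a large one suggests other software is interfering with the measurements.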