Compare with Previous Result

You optimized your code by applying loop interchange, which reduced the application execution time by about 22 seconds. To check whether you got rid of the hotspot and how much each function gained, re-run the Hotspots analysis on the optimized code and compare the results:

  1. Compare results before and after optimization.

  2. Identify the performance gain.
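For reference, the loop interchange mentioned above can be sketched as follows. This is a minimal illustration, not the tutorial's actual source: the function names multiply_ijk and multiply_ikj and the matrix size N are hypothetical, and it assumes the unoptimized version used the naive i, j, k loop order.

```c
#include <stddef.h>

#define N 4  /* hypothetical matrix size, chosen small for illustration */

/* Assumed original order (i, j, k): the innermost loop walks b down a
   column, so consecutive iterations touch elements N doubles apart. */
void multiply_ijk(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Interchanged order (i, k, j): the innermost loop now walks b and c
   along a row, so memory is accessed with unit stride. */
void multiply_ikj(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            for (size_t j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
}
```

Both functions accumulate into c and produce the same result; only the memory access pattern of the innermost loop changes.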

Compare Results Before and After Optimization

  1. From the File menu select New > Knights Corner Platform - Hotspots Analysis.

    VTune Amplifier reruns the Hotspots analysis for the updated matrix target and creates a new result (for example, r002ah) that opens automatically.

  2. Click the Compare Results button on the Intel® VTune™ Amplifier toolbar.

    The Compare Results window opens.

  3. Specify the Hotspots analysis results you want to compare and click the Compare Results button.

    The Summary window opens displaying application-level performance statistics for both results and their difference values.

Identify the Performance Gain

The Result Summary section of the Summary window shows difference information as follows: <Result 1 metric> – <Result 2 metric> = <metric Difference>.
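The difference formula above amounts to a simple subtraction. The sketch below uses a hypothetical helper, metric_difference, that is not part of VTune Amplifier, only to make the sign convention explicit: when Result 1 is the run before optimization, a positive difference means the metric decreased.

```c
/* Difference as described for the Summary window:
   Result 1 metric minus Result 2 metric.
   Hypothetical helper for illustration only. */
double metric_difference(double result1_metric, double result2_metric)
{
    return result1_metric - result2_metric;
}
```

For example, with made-up elapsed times of 100.0 seconds for Result 1 and 78.0 seconds for Result 2, the function returns 22.0, that is, a 22-second gain.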

You see that after optimization all metric values have decreased significantly, though CPI Rate is still an issue (>1).

Switch to the Bottom-up window to view the CPU time usage per function for each result and their differences side by side.

Because the second run no longer uses the multiply1 function, its CPU time shows up in the Difference column as a performance gain.

Click the CPU Time:r002ah column to sort the data in the grid by this column.

The multiply2 function shows up on top as the biggest CPU Time hotspot for the result r002ah, though it performs much better than multiply1. You may try to optimize the code further using more advanced algorithms, for example, by blocking (tiling) access to matrix data to maximize cache reuse.
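One possible shape of such a block-structured multiplication is sketched below. It is only an illustration under stated assumptions: the function name multiply_blocked, the matrix size SIZE, and the tile size TILE are hypothetical, and a suitable tile size depends on the cache sizes of the target hardware and would need to be found by measurement.

```c
#include <stddef.h>

#define SIZE 8  /* hypothetical matrix size */
#define TILE 4  /* hypothetical tile size; tune for the target cache */

/* Upper bound of a tile, clamped to the matrix edge. */
static size_t tile_end(size_t start)
{
    return (start + TILE < SIZE) ? start + TILE : SIZE;
}

/* Blocked (tiled) multiplication: the three outer loops select a tile,
   the three inner loops multiply within it, so each tile of a, b and c
   is reused while it is still likely to be in cache. */
void multiply_blocked(double a[SIZE][SIZE], double b[SIZE][SIZE],
                      double c[SIZE][SIZE])
{
    for (size_t ii = 0; ii < SIZE; ii += TILE)
        for (size_t kk = 0; kk < SIZE; kk += TILE)
            for (size_t jj = 0; jj < SIZE; jj += TILE)
                for (size_t i = ii; i < tile_end(ii); i++)
                    for (size_t k = kk; k < tile_end(kk); k++)
                        for (size_t j = jj; j < tile_end(jj); j++)
                            c[i][j] += a[i][k] * b[k][j];
}
```

The inner loops keep the i, k, j order so that accesses within a tile stay unit-stride. Whether this beats the interchanged version on a given platform is something to confirm by re-running the Hotspots analysis and comparing results again.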

For more complete information about compiler optimizations, see our Optimization Notice.