Check Performance Implications

Run the Suitability tool, which uses the Intel Advisor annotations to predict your program's approximate parallel performance. To do this:

Build a Target for the Suitability Tool

This step was completed previously, using a release build that includes debug information and moderate optimization. Before proceeding, make sure that a release build is selected. See Microsoft Visual Studio IDE or the Intel Advisor GUI.

Run the Suitability Tool

To run the Intel Advisor Suitability tool, do one of the following:

  • Click the Collect Suitability Data or the Start Paused button on the side command toolbar. To hide or show the command toolbar, click the Hide side command toolbar or Show side command toolbar button in the upper-right corner of the Suitability Report.

  • Click the Collect Suitability Data button in the Advisor XE Workflow tab (below 3. Check Suitability).

  • In Visual Studio, click the Start the Survey tool icon in the Tools > Intel Advisor XE 2013 menu or the Intel Advisor toolbar.

While the Suitability tool runs your program, or if the selected project contains no Suitability data, the Suitability Report command toolbar appears. For example, immediately after you click the Collect Suitability Data button:


Suitability Report during tool analysis

Command output appears while the Suitability tool runs your program to predict its parallel performance characteristics. You can redirect application non-GUI output to the Application Output pane (shown above) instead of the command window using the Options dialog box.

After the Suitability tool finalizes the data, the results appear in the result tab (shown below as 2_nqueens_annotated - ennn) in the Suitability Report window:


Suitability Report window

In the upper part of the window, the All Sites pane shows there was one parallel site annotation pair executed in source file nqueens_annotated.cpp (second column) with the label solve (first column). The column Maximum Site Gain indicates the approximate predicted gain for the single parallel site of 12.65x based on the modeling parameters.

On your system, use the drop-down list to set the Target CPU Number to 16.

Note

The values displayed on your screen for Maximum Site Gain and Maximum Program Gain for All Sites: will be different than the numbers shown in this tutorial.

The value after Maximum Program Gain for All Sites: indicates the maximum possible performance gain for all sites combined. In this case, the value 12.38x for the Target CPU Number of 16 indicates a good run-time gain.

Note

Depending on the vertical screen size available, the Selected Site pane may not display both the scalability graph and the annotation grid. In this case, click the displayed link to toggle between displaying the scalability graph or the annotation grid.

Adjust Parameters for Mathematical Modeling

In the All Sites pane, you can change the values for the Target CPU Number and Threading Model items to see how much they influence the Maximum Site Gain value for this site. For example, change the Target CPU Number to a value between 2 and 32 and compare the estimated Maximum Site Gain with the 12.65x shown above for the original Target CPU Number of 16. Similarly, the Selected Site pane in the lower part of the window lists five items that can also influence the Maximum Site Gain value for this site.

Select the Threading Model that you intend to use when you add parallelism. This selection slightly alters the calculations, based on the relative overhead expected.

A Maximum Site Gain value of less than 1.0 indicates a decrease in performance. If this occurs with your program, consider moving or removing the annotations for that site.

Viewing Task Characteristics

The Selected Site pane shows information about the tasks and locks for the selected parallel site. The Number of Instances column shows 14 instances of the task, which matches the default board size (the data set size) for this sample with a release build.

Look at the Average Instance Time value for the task in the Selected Site pane. The Average Instance Time should be large enough to overcome the overhead of starting and ending a task. The Enable Task Chunking option in the lower pane is useful when the task's Average Instance Time is lower than the task start/stop overhead. Consider it when the value is less than 0.01 second, and always use it with much lower values, such as 0.00001 second. If your program has small tasks, you can use the task chunking feature with the Intel® TBB, Intel® Cilk™ Plus, and OpenMP* parallel frameworks.
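To see why chunking helps small tasks, consider a minimal sketch that assumes each chunk of tasks pays one fixed start/stop overhead. This is a simplification for illustration, not the model the Suitability tool uses; the function names and the overhead values are hypothetical.

```cpp
#include <cassert>

// Fraction of a chunk's time spent on useful work, assuming every chunk
// pays one fixed start/stop overhead (illustrative assumption only).
double chunk_efficiency(double avg_instance_time_s,
                        double task_overhead_s,
                        int chunk_size) {
    assert(avg_instance_time_s > 0.0 && task_overhead_s >= 0.0 && chunk_size >= 1);
    const double work = avg_instance_time_s * chunk_size;
    return work / (work + task_overhead_s);
}

// Smallest number of task instances per chunk that reaches the target
// efficiency (0 < target < 1). Terminates because efficiency tends to 1.
int min_chunk_size(double avg_instance_time_s,
                   double task_overhead_s,
                   double target_efficiency) {
    assert(target_efficiency > 0.0 && target_efficiency < 1.0);
    int chunk_size = 1;
    while (chunk_efficiency(avg_instance_time_s, task_overhead_s, chunk_size)
           < target_efficiency) {
        ++chunk_size;
    }
    return chunk_size;
}
```

Under these assumptions, a task with an Average Instance Time of 0.00001 second and an equal overhead spends only half of its time on useful work when run unchunked, and grouping more instances per chunk raises that fraction.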

Similarly, if a task is an innermost loop whose computation time is small and that loop occurs within a nested loop, consider adding task annotations around the next outermost loop.
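The following sketch illustrates placing the task annotations around the outer loop rather than the innermost one. The function row_sums and the site and task labels are hypothetical and do not come from the nqueens sample. The macros are defined here as no-op stand-ins, named after the Intel Advisor C/C++ annotations, so that the sketch compiles on its own; a real project would include the Intel Advisor annotation header instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// No-op stand-ins (assumption): replace with the Intel Advisor annotation
// header in a real project.
#ifndef ANNOTATE_SITE_BEGIN
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_SITE_END()
#define ANNOTATE_TASK_BEGIN(name)
#define ANNOTATE_TASK_END()
#endif

// Hypothetical example: sums each row of a row-major matrix.
std::vector<long> row_sums(const std::vector<int>& flat, std::size_t cols) {
    assert(cols > 0 && flat.size() % cols == 0);
    const std::size_t rows = flat.size() / cols;
    std::vector<long> sums(rows, 0);

    ANNOTATE_SITE_BEGIN(row_sums_site);
    for (std::size_t r = 0; r < rows; ++r) {
        ANNOTATE_TASK_BEGIN(row_task);  // one task per outer iteration
        for (std::size_t c = 0; c < cols; ++c) {
            // Each inner iteration does too little work to be its own task.
            sums[r] += flat[r * cols + c];
        }
        ANNOTATE_TASK_END();
    }
    ANNOTATE_SITE_END();
    return sums;
}
```

Each task then covers a whole row, so the per-task overhead is paid once per row instead of once per element.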

Viewing the Scalability of Maximum Site Gain Graph

In the Selected Site pane, a graph summarizes the Scalability of Maximum Site Gain for the selected site. The number of cores appears on the X axis and the program's run-time performance gain appears on the Y axis. In this case, because the data set size is 14, there are only 14 tasks, so adding cores beyond 14 does not increase the speed-up. Near the top of the vertical line for each CPU number, you may see a box and a circle that indicate the minimum and maximum predicted gain values. The circles for this site appear in the green-shaded area and indicate good results. If the minimum-maximum range appears in the yellow-shaded area, you should investigate how the results can be improved.
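The plateau at 14 cores can be illustrated with an idealized upper bound. The sketch below assumes equal-sized tasks and no overhead, which is not how the Suitability tool models gain; the function name ideal_gain is hypothetical.

```cpp
#include <cassert>

// Idealized gain for `tasks` equal-sized tasks on `cpus` cores: the cores
// work through the tasks in rounds, so run time scales with
// ceil(tasks / cpus). Ignores overhead and uneven task sizes (assumption).
double ideal_gain(int tasks, int cpus) {
    assert(tasks > 0 && cpus > 0);
    const int rounds = (tasks + cpus - 1) / cpus;  // ceiling division
    return static_cast<double>(tasks) / rounds;
}
```

Under these assumptions, ideal_gain(14, 8) is 7, while ideal_gain(14, 14), ideal_gain(14, 16), and ideal_gain(14, 32) are all 14: once every task has its own core, extra cores sit idle. The 12.65x predicted in this tutorial for 16 CPUs lies below this bound, presumably because the tool's model also accounts for overhead.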

To the right of the graph, a table under Changes I will make to this site to improve performance lists items related to parallel overhead, task chunking, and lock contention. If the Recommended column shows Yes for an item, that item offers a benefit. Select its check box to indicate that you agree to take the appropriate action later, when you change annotations into parallel framework code. The graph changes to reflect the modified modeling parameters.

Using the Suitability Source Window and an Editor

To view the sources associated with a task or lock, double-click its line (or right-click it and select View Source) to display the Suitability Source window.

After you determine a correct location, double-click a source line in the Suitability Source window to launch the Visual Studio code editor with that source file opened to the corresponding location.

To return from Suitability Source to the corresponding Suitability Report window, click the Suitability Report button.

Viewing the Summary Window

To view the Summary window, click the Summary button in the result tab:


Summary window after running the Suitability tool

The Summary window provides a dashboard-like summary of data collected by Intel Advisor tools. It provides easy access to detailed data in the report windows, your sources, and collection details. It also lists a count of the source files and annotations found. For example, under Potential program gain, in the column Maximum Site Gain, click the number (shown above as 12.65x) to display the Suitability Report window.

In the example Summary window shown above, because the Correctness tool has not been run for this project, there is no data (indicated by ?) shown under the Correctness Problems column for the solve parallel site. As you run additional Intel Advisor tools, more data gets added to the Summary window.

Key Terms

parallel site, task

For complete information about compiler optimizations, see the Optimization Notice.