Examine your program to measure the approximate predicted performance. You run the Suitability tool, which uses the Intel Advisor annotations to predict your program's approximate parallel performance. To do this:
- Build a target executable for the Suitability tool using a Release build.
- Run the Suitability tool.
- Adjust parameters for mathematical modeling.
- View the Scalability of Maximum Site Gain graph.
- Use the Suitability Source window and an editor.
- View the Summary window.
Build a Target for the Suitability Tool
This step was completed previously, using a release build that includes debug information and moderate optimization. Before proceeding, make sure that a release build is selected. See the Intel Advisor GUI.
Run the Suitability Tool
To run the Intel Advisor Suitability tool, do one of the following:
- Click the Collect Suitability Data or the Start Paused button on the side command toolbar. To hide or show the command toolbar, click the corresponding button in the upper-right corner of the Suitability Report.
- Click the Collect Suitability Data button in the Advisor XE Workflow tab (below 3. Check Suitability).
- In the Intel Advisor GUI, choose File > New > Start Suitability Analysis.
While the Suitability tool runs your program, or if the selected project contains no Suitability data, the Suitability Report command toolbar appears. For example, immediately after you click the Collect Suitability Data button:
Command output appears while the Suitability tool runs your program to predict its parallel performance characteristics. You can redirect application non-GUI output to the Application Output pane (shown above) instead of the command window using the Options dialog box.
After the Suitability tool finalizes the data, the results appear in the result tab in the Suitability Report window:
In the upper part of the window, the All Sites pane shows there was one parallel site annotation pair executed in source file nqueens_annotated.f90 (second column) with the label solve (first column). The column Maximum Site Gain indicates the approximate predicted gain for the single parallel site of 12.63x based on the modeling parameters.
On your system, use the drop-down list to set the Target CPU Count to 16.
The values displayed on your screen for Maximum Site Gain and Maximum Program Gain For All Sites: will differ from the numbers shown in this tutorial.
The value after Maximum Program Gain For All Sites: indicates the possible maximum performance for all sites. In this case, the value 12.59x for the Target CPU Number of 16 indicates a good run-time gain.
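The program-wide gain is usually a little lower than the gain for a single site, because code outside the site still runs serially. A minimal sketch of that relationship using Amdahl's law follows. It is an illustration only, not the model Intel Advisor uses. The function name program_gain and the fractions in the loop are hypothetical.

```python
def program_gain(parallel_fraction: float, site_gain: float) -> float:
    """Amdahl's law: whole-program speed-up when only part of the run time speeds up.

    parallel_fraction: share of serial run time spent inside the parallel site (0..1).
    site_gain: predicted speed-up of the site itself (for example, 12.63).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if site_gain <= 0.0:
        raise ValueError("site_gain must be positive")
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / site_gain)


if __name__ == "__main__":
    # Hypothetical fractions of serial run time spent inside the site.
    for fraction in (0.90, 0.99, 0.999):
        gain = program_gain(fraction, 12.63)
        print(f"fraction={fraction}: program gain = {gain:.2f}x")
```

The closer the site's share of the run time is to 100%, the closer the program gain gets to the site gain.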
Depending on the vertical screen size available, the Selected Site pane may not display both the scalability graph and the annotation grid. In this case, click the displayed link to toggle between displaying the scalability graph or the annotation grid.
Adjust Parameters for Mathematical Modeling
In the All Sites pane, you can change the values for the Target CPU Number and Threading Model items to see how much they influence the Maximum Site Gain value for this site. For example, try Target CPU Number values from 2 to 32 and compare the estimated Maximum Site Gain with the value shown above, 12.63x for the original Target CPU Number of 16. Similarly, the Selected Site pane in the lower part of the window lists five items that can also influence the Maximum Site Gain value for this site.
Select the Threading Model that you intend to use when you add parallelism. This selection slightly alters the calculations, based on the relative overhead expected. For this Fortran sample, select OpenMP as the Threading Model.
A Maximum Site Gain value of less than 1.0 indicates a decrease in performance. If this occurs with your program, consider moving or removing the annotations for that site.
Viewing Task Characteristics
The Selected Site pane shows information about the tasks and locks for the selected parallel site. With your own program, use this information to help you decide where to add task annotations, especially if the program contains nested loops. The Number of Instances column shows 14 instances of the task, which matches the default data set board size for this sample with a release build.
Look at the Average Instance Time value for the task in the Selected Site pane. The Average Instance Time should be large enough to outweigh the overhead of starting and ending a task. The Enable Task Chunking item in the lower pane is useful when the task's Average Instance Time is lower than the task start/stop overhead time. Consider it when the Average Instance Time is less than 0.01 second, and always use it for much lower values, such as 0.00001 second. If your Fortran program has small tasks, you can use the task chunking feature with the OpenMP* parallel framework.
Similarly, if a task is an innermost loop whose computation time is small and that loop occurs within a nested loop, consider adding task annotations around the next outermost loop.
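To see why small tasks benefit from chunking, consider a rough estimate of how much of each task's time is useful work rather than start/stop overhead. The following is a minimal sketch under stated assumptions: the function task_efficiency and the overhead value of 0.0001 second are hypothetical, and the formula is a simplification rather than the model used by the Suitability tool.

```python
def task_efficiency(avg_instance_time: float, overhead: float, chunk_size: int = 1) -> float:
    """Fraction of time spent on useful work when chunk_size task
    instances share a single task start/stop overhead."""
    if avg_instance_time <= 0.0 or overhead < 0.0 or chunk_size < 1:
        raise ValueError("times must be non-negative and chunk_size at least 1")
    useful = avg_instance_time * chunk_size
    return useful / (useful + overhead)


if __name__ == "__main__":
    assumed_overhead = 0.0001  # seconds; hypothetical value for illustration
    for instance_time in (0.01, 0.00001):
        for chunk in (1, 100):
            eff = task_efficiency(instance_time, assumed_overhead, chunk)
            print(f"instance={instance_time}s chunk={chunk}: {eff:.1%} useful work")
```

Under these assumptions, grouping many short instances into one chunk spreads a single start/stop cost across all of them, which is the effect that task chunking aims for.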
Viewing the Scalability of Maximum Site Gain Graph
In the Selected Site pane, a graph summarizes the Scalability of Maximum Site Gain for the selected site. The number of cores appears on the X axis and the program's run-time performance gain appears on the Y axis. In this case, because the data set size is 14, there are only 14 tasks, so adding cores beyond 14 does not increase the speed-up. Near the top of the vertical line for each CPU number, you may see a box and a circle that indicate the minimum and maximum predicted gain values. The circles for this site appear in the green-shaded area and indicate good results. If the minimum-maximum range appears in the yellow-shaded area, you should investigate how the results can be improved.
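The ceiling imposed by the task count can be illustrated with a small calculation. This sketch assumes that all tasks take the same time and that there is no overhead or lock contention, which is more optimistic than the tool's own model. The function name ideal_gain is hypothetical.

```python
import math


def ideal_gain(num_tasks: int, num_cores: int) -> float:
    """Best-case speed-up for num_tasks equal-length tasks on num_cores cores.

    The tasks run in rounds of at most num_cores tasks each, so the
    parallel time is the number of rounds times one task's duration.
    """
    if num_tasks < 1 or num_cores < 1:
        raise ValueError("num_tasks and num_cores must be at least 1")
    rounds = math.ceil(num_tasks / num_cores)
    return num_tasks / rounds


if __name__ == "__main__":
    for cores in (2, 4, 8, 14, 16, 32):
        print(f"{cores:2d} cores: ideal gain {ideal_gain(14, cores):.1f}x")
```

With 14 tasks, the ideal gain reaches 14x at 14 cores and stays there for 16 or 32 cores, since the extra cores have no task to run.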
To the right of the graph is a table under Changes I will make to this site to improve performance that lists items related to parallel overhead, task chunking, and lock contention. Under the Recommended column, if you see the word Yes, there is a benefit for that item. In this case, you should select the check box to indicate that you agree to take the appropriate action later when you change annotations into parallel framework code. The graph will change to reflect the modified modeling parameters.
Using the Suitability Source Window and an Editor
To view the sources associated with a task or lock, double-click (or right-click and select View Source) a line to display the Suitability Source window.
After you determine a correct location, double-click a source line in the Suitability Source window to launch the code editor with that source file opened to the corresponding location. Within the Intel Advisor GUI, click File > Options > Editor to choose the Linux* editor displayed for each source language.
To return from Suitability Source to the corresponding Suitability Report window, click the Suitability Report button.
Viewing the Summary Window
To view the Summary window, click the Summary button in the result tab:
The Summary window provides a dashboard-like summary of data collected by Intel Advisor tools. It provides easy access to detailed data in the report windows, your sources, and collection details. It also lists a count of the source files and annotations found. For example, under Potential program gain, in the column Maximum Site Gain, click the number (shown above as 12.63x) to display the Suitability Report window.
In the example Summary window shown above, because the Correctness tool has not been run for this project, there is no data (indicated by ?) shown under the Correctness Problems column for the solve parallel site. As you run additional Intel Advisor tools, more data gets added to the Summary window.