Minimizing Data Collection, Result Size, and Execution Time (Correctness)

For medium-large targets, several methods are available to minimize the amount of data collected and target execution time. Minimizing the data collected reduces the amount of data you need to examine in the Correctness Report; it also reduces the size of the generated result. To minimize execution time with the Correctness tool, consider reducing the data set processed by the application.

Using Pause Collection and Resume Collection Annotations to Minimize Data Collection

The Correctness tool recognizes all annotations, including:

  • Pause Collection: Stops data collection to skip uninteresting parts of the target program's execution. This minimizes the data collected, speeds up the analysis of medium-large applications, and minimizes execution time.

  • Resume Collection: Resumes previously paused data collection so that data is collected for the interesting parts of your program.

Place Pause Collection and Resume Collection annotations outside the parallel site code regions, which are defined by parallel site begin and parallel site end annotations.
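The placement described above might look like the following minimal sketch. It assumes the C/C++ annotation macros from advisor-annotate.h, with ANNOTATE_DISABLE_COLLECTION_PUSH as Pause Collection and ANNOTATE_DISABLE_COLLECTION_POP as Resume Collection; check these names against the header installed with your product version. The no-op stand-in definitions are there only so the sketch compiles without Intel Advisor, and fill_input, sum_of_squares, sum_site, and sum_task are hypothetical names.

```c
#include <stddef.h>

/* In a real build, include "advisor-annotate.h" instead of these no-op
   stand-ins, which exist only so this sketch compiles on its own. */
#ifndef ANNOTATE_SITE_BEGIN
#define ANNOTATE_DISABLE_COLLECTION_PUSH ((void)0)
#define ANNOTATE_DISABLE_COLLECTION_POP  ((void)0)
#define ANNOTATE_SITE_BEGIN(site)        ((void)0)
#define ANNOTATE_SITE_END()              ((void)0)
#define ANNOTATE_ITERATION_TASK(task)    ((void)0)
#endif

/* Hypothetical setup phase that is not of interest to the analysis. */
static void fill_input(int *data, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        data[i] = (int)(i % 7);
}

/* Hypothetical workload: collection is paused during setup and resumed
   before the parallel site begins. */
long sum_of_squares(int *data, size_t n)
{
    long total = 0;

    ANNOTATE_DISABLE_COLLECTION_PUSH;  /* Pause Collection, outside the site */
    fill_input(data, n);
    ANNOTATE_DISABLE_COLLECTION_POP;   /* Resume Collection, still outside the site */

    ANNOTATE_SITE_BEGIN(sum_site);
    for (size_t i = 0; i < n; ++i) {
        ANNOTATE_ITERATION_TASK(sum_task);
        total += (long)data[i] * data[i];  /* accumulator shared by all tasks */
    }
    ANNOTATE_SITE_END();

    return total;
}
```

Both annotations sit before ANNOTATE_SITE_BEGIN, so the pause and resume pair never splits the parallel site itself.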

Using Pause or Stop Buttons to Minimize Data Collection

Clicking the Pause button on the Correctness Report side command toolbar is equivalent to executing a Pause Collection annotation. For data collection to continue, your code must later execute a Resume Collection annotation.

Clicking the Stop button on the Correctness Report side command toolbar stops data collection, then finalizes and displays the partially collected data. You might click this button if you already see many types of problems reported during collection and do not want to wait for additional analysis.

Reducing the Input Data Set with Adequate Code Coverage in Parallel Sites

When you run your program with the Correctness tool, it is very important that you choose appropriate input data for the specified parallel sites. There are two concerns:

  • Execution time: Running the Correctness tool is expensive. A program may take 50 to hundreds of times longer to run than it does normally. For example, if you run your program with an input data set that would normally take 25 minutes to process, the Correctness tool may take a day or more to run your program. The Correctness tool only collects data as it executes within the parallel sites. To minimize increased program run times, choose input data sets that minimize the number of instructions executed within a parallel site while thoroughly exercising your program's control flow paths. For example, the same data sharing problem will be detected whether you execute a loop once or a million times - but executing a loop a million times takes much longer.

  • Code coverage: It is even more important to choose input data that will cause most of the code within the parallel sites to be executed with the same sort of control flow as when it is processing real data. The Correctness tool is an excellent tool, but it is not magic. The only information that Intel Advisor has about your program comes from watching the data accessed by the code executed within parallel sites, including interleaved parallel operations within parallel tasks. It can find potential conflicts, but only between operations that are executed.
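To make the execution-time concern concrete, the arithmetic behind the 25-minute example can be sketched as below. The helper estimated_hours is hypothetical and is not part of Intel Advisor; the slowdown factor is whatever multiple you observe or expect, such as the factor of 50 mentioned above.

```c
/* Hypothetical helper: estimated hours under the Correctness tool for a run
   that normally takes `normal_minutes`, given a slowdown factor. */
double estimated_hours(double normal_minutes, double slowdown)
{
    return normal_minutes * slowdown / 60.0;
}
```

With a slowdown of 50, a 25-minute data set gives estimated_hours(25.0, 50.0), or roughly 20.8 hours, while a reduced data set that normally runs in one minute would take about 50 minutes at the same factor.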

For example, consider this code:

int best_thing(thing *array, int size)
{
    int best = 0;  /* index of the best element seen so far */
    for (int i = 1; i < size; ++i) {
        if (better(array[i], array[best])) best = i;
    }
    return best;
}

If each iteration of this loop runs as a separate task, the code has a potential conflict between the write to best in one task and the write to best in any other task, and another between the read of best in one task and the write to best in any other task. Intel Advisor will report these conflicts, but only if the write to best is actually executed.

But what will happen if, with the input data provided for the Correctness tool run, the first thing in the array is the best thing? In that case, there will be no writes to best executed in any iteration of the loop. Since the Correctness tool will not see any writes to best, it will not be able to report the potential conflicts.
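One way to check whether an input data set exercises the write at all is to count the writes in an ordinary run before starting the Correctness tool. The sketch below is a hypothetical instrumented variant of an index-based search loop: it uses plain int values and a greater-than comparison as a stand-in for better(), and the function name and writes counter are illustrative only.

```c
/* Hypothetical instrumented search: returns the index of the largest value
   and stores in *writes how many times `best` is written inside the loop. */
int best_index_counting_writes(const int *array, int size, int *writes)
{
    int best = 0;
    *writes = 0;
    for (int i = 1; i < size; ++i) {
        if (array[i] > array[best]) {  /* stand-in for better() */
            best = i;
            ++*writes;
        }
    }
    return best;
}
```

For the input {9, 3, 5, 1} the loop performs no writes to best, so a Correctness run on that input would have nothing to observe. For {1, 3, 5, 9} it performs three writes, which makes that input the more useful of the two even though both are the same size.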

So, you should choose input data for your Correctness run that will thoroughly exercise your program's control flow paths, but will not make your program run any longer than necessary.