Support for Intel® Cilk™ Plus in Intel® Parallel Inspector 2011

Intel® Cilk™ Plus is a simple and powerful abstraction for expressing parallelism. It is one of the Intel® Parallel Building Blocks and is included in Intel® Parallel Composer 2011, which is part of Intel® Parallel Studio 2011. Because this is the first release to include Intel® Cilk™ Plus, it is important to understand how the analysis features of Intel® Parallel Studio 2011 display results when your software uses it. This article details the level of support provided by Intel® Parallel Inspector 2011. Cilk Plus support should improve in future versions.

Overview
You can expect that Parallel Inspector 2011 will not crash while analyzing a project that includes Cilk Plus code. In general, either Memory or Threading Analysis of a project with Cilk Plus code should generate a strict superset of the diagnostics that are generated for the same code without the Cilk keywords. This means that some additional diagnostics may be reported for Cilk Plus code. Most of these will be “false positives”, where an issue is identified that is not really present; these are caused by interactions between the user code and the Cilk Plus runtime library and by incomplete instrumentation. Some of the additional diagnostics might be accurate observations of behavior in the Cilk Plus runtime that are still not correctness issues: the runtime has done something by design that Parallel Inspector incorrectly categorized as a problem. There are a few exceptions to the superset assertion above, which are described in the sections that follow.

Threading Analysis:
Threading Analysis, at any level, may report false positives on Cilk Plus code but, apart from the one case described later in this section, should not produce false negatives. In other words, Parallel Inspector may report problems that are not really issues, but it should not fail to find legitimate problems. These “false positives” may be found in user code or in the Cilk Plus runtime. The figure below shows the results of running Threading Analysis on a simple count primes program with one cilk_for loop. These results show both real data races (P1, P2, and P4, which were introduced for the purpose of this example) and a false positive data race related to the Cilk Plus runtime (P3).


Figure: Threading Analysis results for the Count Primes program with data races
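The article does not include the source of the count primes program, so the following is only a hypothetical sketch of what a cilk_for loop with a deliberately introduced data race might look like. The names is_prime and count_primes_racy are invented for illustration. The sketch assumes the compiler predefines the __cilk macro when Cilk Plus is enabled; without it, cilk_for is defined as a plain for so the code still builds and runs serially.

```cpp
#include <cassert>

// Assumption: __cilk is predefined when Cilk Plus is enabled.
// Otherwise fall back to a serial loop so the sketch still compiles.
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_for for
#endif

// Trial-division primality test (hypothetical helper).
static bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts the primes in [2, limit]. In a parallel build the increment of
// 'count' is an unsynchronized read-modify-write shared by all loop
// iterations, which is the kind of real data race Threading Analysis
// is expected to report.
int count_primes_racy(int limit) {
    assert(limit >= 0);
    int count = 0;
    cilk_for (int i = 2; i <= limit; ++i) {
        if (is_prime(i)) {
            ++count;  // data race when iterations run on different threads
        }
    }
    return count;
}
```

In a serial build the function is deterministic, which makes it a convenient baseline when deciding whether a reported diagnostic belongs to your code or to the runtime.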

If you are using Threading Analysis with a project containing Cilk Plus code and find a diagnostic that you determine is a false positive, you can use the suppression feature to ensure you don’t see it again. If the diagnostic is in the Cilk Plus runtime, you can either suppress it or report it to the Parallel Inspector support team at http://premier.intel.com.
There is one case where Threading Analysis may fail to find legitimate data races in your Cilk Plus code: Parallel Inspector will not find races between a spawned child and its parent unless the parent is actually stolen. A cilk_for or cilk_spawn only gives the runtime permission to run work on other threads; the runtime uses heuristics to decide whether to do so. For example, if a cilk_for loop can run quickly on one thread, the Cilk Plus runtime may not actually distribute its work to other available threads. In that case, if the loop contained a data race, Threading Analysis would not report it. Parallel Inspector detects races between threads, so it finds races between different tasks only if those tasks are mapped to different threads. To have a reasonable chance of exposing a potential data race, run your program with a variety of data sets that cause a variety of mappings of tasks to threads.
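The parent/child case described above can be sketched as follows. This is a hypothetical example, not code from the article: shared_total, add_range, and racy_sum are invented names, and the fallback macros assume that __cilk is predefined only when Cilk Plus is enabled. The spawned child and the parent's continuation both update the same variable without synchronization, but the two updates only run on different threads if another worker steals the continuation.

```cpp
#include <cassert>

// Assumption: __cilk is predefined when Cilk Plus is enabled.
// Otherwise the keywords are defined away so the sketch runs serially.
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

static int shared_total = 0;  // written by both child and parent

// Adds lo, lo+1, ..., hi-1 to the shared total without any locking.
void add_range(int lo, int hi) {
    for (int i = lo; i < hi; ++i) {
        shared_total += i;
    }
}

int racy_sum(int n) {
    assert(n >= 0);
    shared_total = 0;
    cilk_spawn add_range(0, n / 2);  // spawned child
    add_range(n / 2, n);             // parent continuation: races with the
                                     // child only if it is stolen
    cilk_sync;                       // wait for the child before reading
    return shared_total;
}
```

If the runtime never steals the continuation, both calls execute on one thread, the result is always correct, and a thread-based race detector has nothing to observe. That is why varying the input size and the resulting task-to-thread mapping matters.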

Memory Analysis:
As with Threading Analysis, Memory Analysis may report false positives in either the user’s code or the Cilk Plus runtime. Handle these false positives in the same way recommended above. There is one additional item to be aware of when running Memory Analysis on Cilk Plus code: do not use the highest level of memory analysis. The highest level also enables stack access analysis, which currently may cause the Cilk Plus application to operate incorrectly and data collection to fail. Even without this feature enabled, Parallel Inspector may sometimes report incorrect call stacks for Cilk Plus code.

Summary and Where to Go for Help
Cilk Plus lets you easily serialize your code (that is, have the compiler ignore the Cilk Plus keywords) using the option /Qcilk-serialize. This option can help you sort out false positives due to Cilk Plus code, but using it is not a necessary step. As mentioned in the overview, Parallel Inspector should analyze projects containing Cilk Plus code without crashing. With the exceptions given above, the results of either Memory or Threading Analysis may contain false positives but should be otherwise complete. For additional help, please post a question on the Intel Parallel Studio forum or submit an issue using http://premier.intel.com.
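To illustrate what serialization means, the sketch below approximates it by hand with preprocessor macros. This is an assumption about the observable effect of the option, not a description of how the compiler implements it. SERIALIZE_CILK is a hypothetical macro invented for this example, and __cilk is assumed to be predefined only when Cilk Plus is enabled. With the keywords defined away, the function is an ordinary serial C++ function that returns the same result.

```cpp
#include <cassert>

// SERIALIZE_CILK is a hypothetical switch used only in this sketch.
// When it is defined, or when Cilk Plus is unavailable, the keywords
// expand to nothing and the program becomes its own serial version.
#if defined(__cilk) && !defined(SERIALIZE_CILK)
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

// Recursive Fibonacci. The two recursive calls are independent,
// so this function has no data race in either build.
int fib(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    int x = cilk_spawn fib(n - 1);  // may run in parallel with the next call
    int y = fib(n - 2);
    cilk_sync;                      // x is safe to read only after the sync
    return x + y;
}
```

Running the analysis on both the parallel and the serialized build gives a baseline: a diagnostic that appears only in the parallel build is related to the Cilk Plus keywords or runtime, and can then be examined as either a real race or a false positive.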
