Intel® Advisor

FLOPS Metrics

Data collected with the FLOPS analysis appears in the Survey report in the expandable column. While collapsed, FLOPS column provides only GFLOPS and AI data.

Expanded FLOPS column provides the following metrics:

Task Interactions and Suitability

If your tasks access the same memory locations, then, left to themselves, they will tend to trip over each other. You can solve this by adding synchronization code to make sure the tasks are well-behaved when they access shared memory locations, but synchronization code can be tedious to add and hard to get right, and it is easy to end up with tasks that spend more time doing synchronization than doing work.

Enabling Task Chunking

Chunking means that the parallel framework will merge several tasks into a single task, with little or no overhead between them. For instance, if tasks are loop iterations, chunking would mean that several iterations are executed together (as a chunk) before heavyweight task control is performed.

Chunking is typically implemented when you convert to a parallel framework:

OpenMP*

OpenMP* is a parallel programming framework for C, C++, or Fortran code. Using OpenMP requires few source changes and is supported by multiple compilers. Because OpenMP is supported by OpenMP libraries, you modify your source code with compiler directives rather than using types, variables, and calls. An OpenMP program can often be changed from parallel execution to serial execution by omitting a compiler option so the compiler ignores the OpenMP directives.

Legal Information

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Testing the Intel Cilk Plus Synchronization Code

After you add Intel Cilk Plus synchronization code (such as reducers/mutexes), but before adding the constructs that cause the program to use parallel execution, you should test your serial program. The synchronization code may introduce problems if you have inadvertently used a non-recursive mutex in a recursive context, or if your edits accidentally changed some other piece of program behavior.

Subscribe to Intel® Advisor