Parallelize Data - Intel TBB Counted Loops

When tasks are loop iterations, and the iterations are over a range of values that are known before the loop starts, the loop is easily expressed in Intel TBB.

Consider the following serial loop, marked with Intel Advisor annotations, that you want to parallelize:

    ANNOTATE_SITE_BEGIN(sitename);
        for (int i = lo; i < hi; ++i) {
            ANNOTATE_ITERATION_TASK(taskname);
                statement;
        }
    ANNOTATE_SITE_END();

Here is the serial example converted to use Intel TBB, after you remove the Intel Advisor annotations:

    #include <tbb/tbb.h>
    ...
    tbb::parallel_for( lo, hi,
        [&](int i) {statement;}
    );

The first two parameters are the loop bounds. As is typical in C++ (especially STL) programming, the lower bound is inclusive and the upper bound is exclusive, so the loop covers the half-open range [lo, hi). The third parameter is the loop body, wrapped in a lambda expression. Intel TBB calls the loop body in parallel from threads that it creates. As described in Create the Tasks, Using C++ structs Instead of Lambda Expressions, you can replace the lambda expression with an instance of an explicitly defined class.
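As a concrete illustration, the following minimal sketch fills in the placeholder statement with a body that squares each element of a vector, first as a lambda expression and then as an explicitly defined class. The names square_all_lambda, square_all_functor, and SquareBody are hypothetical and chosen only for this example. The sketch assumes that each iteration touches only its own element, so the iterations can safely run concurrently and in any order. The small stand-in under #else is also an assumption, included only so the sketch compiles where the Intel TBB headers are missing; it is not part of Intel TBB.

```cpp
#include <cstddef>
#include <vector>

#if __has_include(<tbb/parallel_for.h>)
#include <tbb/parallel_for.h>
#else
// Assumption: serial stand-in so this sketch still compiles without
// Intel TBB installed. It is NOT part of Intel TBB and runs serially.
namespace tbb {
template <typename Index, typename Body>
void parallel_for(Index first, Index last, const Body& body) {
    for (Index i = first; i < last; ++i) body(i);
}
}  // namespace tbb
#endif

// Hypothetical example: square every element of `values` in place.
// Each iteration writes only values[i], so iterations are independent.
void square_all_lambda(std::vector<double>& values) {
    tbb::parallel_for(std::size_t(0), values.size(),
        [&](std::size_t i) { values[i] = values[i] * values[i]; }
    );
}

// The same loop body written as an explicitly defined class.
class SquareBody {
    double* data_;
public:
    explicit SquareBody(double* data) : data_(data) {}
    // Called once per index in [lower bound, upper bound).
    // It must be const because the body is passed by const reference.
    void operator()(std::size_t i) const { data_[i] = data_[i] * data_[i]; }
};

void square_all_functor(std::vector<double>& values) {
    tbb::parallel_for(std::size_t(0), values.size(),
                      SquareBody(values.data()));
}
```

Both bounds are passed as std::size_t because the two bound arguments must have the same type. When the bounds are equal, as with an empty vector, the body is never called.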
