User Guide

Contents

Independent Updates

Independent updates can occur when multiple tasks contribute to determining the final value of a memory location.

The Basic Pattern

Suppose that multiple tasks write to a memory location, that the value written by each task is computed using the previous value in that location, and that the order in which the tasks update the memory location does not matter.
For example, consider a loop that sums all the values in an array:
extern int x; // ... ANNOTATE_SITE_BEGIN(site1); for (i = 0; i != n; ++i) {     ANNOTATE_ITERATION_TASK(task1);     x = x + a[i]; } ANNOTATE_SITE_END(site1); printf("%d\n", x);
The sharing problem looks like this:
  1. Task 0 reads
    x
    .
  2. Task 1 reads
    x
    .
  3. Task 0 adds
    a[0]
    to the value it read and stores the result back in
    x
    .
  4. Task 1 adds
    a[1]
    to the value it read and stores the result back in
    x
    , overwriting the value stored by task 0.
The important fact is that the updates of
x
can be performed in any order. All you need to do is to make sure that no task can write to
x
between the read from
x
and the write to
x
in any other task; the uses of
x
in the tasks are otherwise independent.

Reductions

Reductions
are a special case of the independent update pattern. The reduction pattern occurs when a loop combines a collection of values using a commutative, associative function.
In Adding Parallelism to Your Program, you will see that the Intel® Threading Building Blocks
and OpenMP*
parallel frameworks have special features for writing parallel reductions.

Transactions

In a more general form of this pattern, there may be multiple memory locations which must be updated together.
void insert_node_in_list(T *new_node, T *insert_after) {     new_node->next = insert_after->next;     new_node->prev = insert_after->next->prev;     insert_after->next->prev = new_node;     insert_after->next = new_node; }
Two insertions must not occur simultaneously, but the insertions may occur in any order, as long as the final list order does not matter.
A collection of updates that must all occur together is referred to as a
transaction
.

Guard Variables

A special case is the use of a shared memory location to control some additional code. The update and the code that depends on it may be treated as a transaction.
bool initialized = false; void do_something() {     if (!initialized) {         do_the_initialization();         initialized = true;     }     do_the_real_work(); }
If
do_something()
is called from multiple tasks, then the sharing problem is:
  1. Task 0 reads
    initialized
    , which is false, and enters the body of the
    if
    statement.
  2. Task 1 reads
    initialized
    , which is false, and enters the body of the
    if
    statement, so
    do_the_initialization()
    is called twice.
It does not matter which task the initialization occurs in, so your only problem is to make sure that other tasks wait until this initialization has happened.

Independent Writes

The simplest case occurs when the value that the tasks write to the memory location does not depend on its previous value:
bool found = false; ANNOTATE_SITE_BEGIN(site1); for (i = 0; i != n; ++i) {     ANNOTATE_ITERATION_TASK(task1);     if (a[i] == b) found = true; } if (found) printf("found\n"); ANNOTATE_SITE_BEGIN(site1);
There is no read to keep together with the write, and it does not matter what order the writes to found occur in, so the tasks are totally independent, and can execute concurrently without restrictions. If a task writes to
found
at all, it will write the value true.
Note that if the task body were the following, then this example would fit the reduction pattern:
found = found || (a[i] == b);
This is also called a benign race because the program will always compute the same value, regardless of which thread does the last write.

Product and Performance Information

1

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804