User Guide

Common Issues When Adding Parallelism

The types of problems encountered by parallel programs include shared memory data conflicts and incorrect locking.

Shared Memory Problems

Introducing parallelism can result in unexpected problems when parallel tasks access the same memory location. Such problems are known as data races. For example, in the Primes sample, the following line calls the function Tick():
  if (IsPrime(p)) Tick();
The called function Tick() increments the global variable primes:
  void Tick() { primes++; }
Consider the following scenario, where the value of primes is incremented only once instead of twice:
Time  Thread 0                  Thread 1
T1    Enters function Tick()
T2                              Enters function Tick()
T3    Load value of primes
T4                              Load value of primes
T5    Increment loaded value
T6    Store value of primes
T7                              Increment loaded value
T8                              Store value of primes
T9    Return
T10                             Return
If you run this as a serial program, this problem does not occur. However, when you run it with multiple threads, the tasks may run in parallel, and an update to primes can be lost, leaving its final value lower than the number of calls to Tick().
Such problems are non-deterministic, difficult to detect, and at first glance might seem to occur at random. The results can vary based on multiple factors, including the workload on the system, the data being processed, the number of cores, and the number of threads.
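Because the real race is non-deterministic, it can help to replay the interleaving from the table by hand. The following is a minimal single-threaded sketch, not code from the Primes sample: the function name ReplayLostUpdate and the variables reg0 and reg1 (standing in for each thread's private copy of the loaded value) are hypothetical names introduced only for illustration.

```cpp
// Deterministic, single-threaded replay of the interleaving in the table.
// reg0 and reg1 are hypothetical stand-ins for the value each thread loaded.
int primes = 0;  // the shared global counter, as in the Primes sample

int ReplayLostUpdate() {
    primes = 0;
    int reg0 = primes;  // T3: thread 0 loads primes (0)
    int reg1 = primes;  // T4: thread 1 loads primes (still 0)
    reg0 = reg0 + 1;    // T5: thread 0 increments its loaded value
    primes = reg0;      // T6: thread 0 stores 1
    reg1 = reg1 + 1;    // T7: thread 1 increments its stale loaded value
    primes = reg1;      // T8: thread 1 stores 1, overwriting thread 0's update
    return primes;      // 1, although two increments were attempted
}
```

Calling ReplayLostUpdate() returns 1 rather than 2, which is the lost update described above: both threads loaded the old value before either one stored its result.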
It is possible to use locks to restrict access to a shared memory location to one task at a time. However, all implementations of locks add overhead. It is more efficient to avoid the sharing by replicating the storage. This is possible if data values are not being communicated between the tasks, even though the memory locations are being reused.
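The two approaches can be compared in a short sketch. This is an illustration under stated assumptions, not the implementation used by the Primes sample: it uses standard C++ threads and std::mutex, the IsPrime() body is a simple trial-division stand-in, and the names CountPrimesLocked, CountPrimesReplicated, limit, and num_threads are hypothetical.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Trial-division test; a stand-in for the sample's IsPrime().
bool IsPrime(int p) {
    if (p < 2) return false;
    for (int d = 2; d * d <= p; ++d)
        if (p % d == 0) return false;
    return true;
}

// Option 1: keep the shared counter, but guard every increment with a lock.
int CountPrimesLocked(int limit, int num_threads) {
    int primes = 0;
    std::mutex primes_mutex;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Thread t tests t, t + num_threads, t + 2 * num_threads, ...
            for (int p = t; p <= limit; p += num_threads) {
                if (IsPrime(p)) {
                    std::lock_guard<std::mutex> guard(primes_mutex);
                    primes++;  // only one thread at a time executes this
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    return primes;
}

// Option 2: replicate the storage so nothing is shared while the tasks run.
int CountPrimesReplicated(int limit, int num_threads) {
    std::vector<int> local_counts(num_threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            int count = 0;  // private to this thread
            for (int p = t; p <= limit; p += num_threads)
                if (IsPrime(p)) count++;
            local_counts[t] = count;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) w.join();
    int primes = 0;
    for (int c : local_counts) primes += c;  // combine after all threads finish
    return primes;
}
```

Both functions return the same count for the same limit. The replicated version needs no lock because the per-thread counts are never read by another task while the threads run; they are combined only after every thread has been joined.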

Lock Problems

One thread (thread A) may have to wait for another thread (thread B) to release a lock before it can proceed. While thread A waits, the core executing it is not performing useful work. This is a case of lock contention. If, at the same time, thread B is waiting for thread A to release a different lock, neither thread can ever proceed. Such a condition is called a deadlock.
Like a data race, a deadlock can occur in a non-deterministic manner. It might occur only when certain factors exist, such as the workload on the system, the data being processed, or the number of threads.
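One common way to avoid the two-lock situation described above is to acquire both locks in a single step. The sketch below is an assumption-laden illustration rather than anything taken from this guide: the Account type and the functions Transfer and RunOpposingTransfers are hypothetical, and the technique shown is the C++ standard library's std::lock, which acquires several mutexes using a deadlock-avoidance algorithm.

```cpp
#include <mutex>
#include <thread>

// Hypothetical example data: a balance guarded by its own lock.
struct Account {
    std::mutex lock;
    int balance = 0;
};

// Moves amount from one account to the other while holding both locks.
// The two accounts must be distinct objects.
void Transfer(Account& from, Account& to, int amount) {
    // std::lock acquires both mutexes without deadlocking, regardless of
    // the order in which other threads request the same pair.
    std::lock(from.lock, to.lock);
    // adopt_lock: the guards take over the already-held locks and release
    // them when they go out of scope.
    std::lock_guard<std::mutex> hold_from(from.lock, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.lock, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Two threads transfer in opposite directions. If each thread instead locked
// "its" account first and then waited for the other, this could deadlock.
int RunOpposingTransfers(int rounds) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) Transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) Transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;  // the total is unchanged by transfers
}
```

Acquiring locks in one consistent, global order is an alternative with the same effect. Either way, the locks still cause contention; the sketch only removes the possibility that thread A and thread B wait on each other indefinitely.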

Ensuring the Parallel Portions of a Program are Thread Safe

Intel® Advisor can detect many problems related to parallelism. Because it only analyzes the serial execution of your program, Intel Advisor cannot detect all possible errors. When you have finished using Intel Advisor to introduce parallelism into your program, you should use the Intel® Inspector and other Intel software suite products. These tools, together with a debugger, can detect parallelism problems that normal testing will not detect, and can also identify times when the cores are idle.
