User Guide

Adding OpenMP Code to Synchronize the Shared Resources

OpenMP provides several forms of synchronization:
  • A critical section prevents multiple threads from executing the enclosed code at the same time, so only one thread at a time can update the data referenced by that code. A critical section may consist of one or more statements. To implement a critical section (see the critical-section sketch after this list):
    • With C/C++: #pragma omp critical
    • With Fortran: !$omp critical and !$omp end critical
    Use the optional named form for a non-nested mutex: #pragma omp critical(name) in C/C++, or !$omp critical(name) and !$omp end critical(name) in Fortran. If the optional (name) is omitted, the construct locks a single unnamed global mutex. The easiest approach is to use the unnamed form unless performance measurements show that this shared mutex causes unacceptable delays.
  • An atomic operation allows multiple threads to safely update a shared numeric variable on hardware platforms that support it. An atomic operation applies only to the single assignment statement that immediately follows it. To implement an atomic operation (see the atomic sketch after this list):
    • With C/C++: insert #pragma omp atomic before the statement to be protected.
    • With Fortran: insert !$omp atomic before the statement to be protected.
    The statement to be protected must meet certain criteria (see your compiler or OpenMP documentation).
  • Locks provide a low-level means of general-purpose locking. To implement a lock, use the OpenMP lock types, variables, and functions, which offer more flexible and powerful locking than the higher-level constructs: for example, the omp_lock_t type in C/C++ or a variable of integer(kind=omp_lock_kind) in Fortran. These types and functions are easy to use and usually directly replace Intel Advisor lock annotations (see the lock sketch after this list).
  • Reduction operations can be used for simple cases, such as incrementing a shared numeric variable or summing an array into a shared numeric variable. To implement a reduction operation, add the reduction clause to a parallel region to instruct the compiler to perform the reduction in parallel using the specified operation and variable (see the reduction sketch after this list).
  • OpenMP provides other synchronization techniques, including the barrier construct, where threads wait for each other; the ordered construct, which ensures sequential execution of a structured block within a parallel loop; and master regions, which only the master thread can execute (see the barrier/master sketch after this list). For more information, see your compiler or OpenMP documentation.
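
The sketches below are minimal C/C++ illustrations of the techniques above; the variable names, loop bounds, and printed messages are hypothetical, not taken from this guide. First, a critical-section sketch using the unnamed form:

  #include <stdio.h>

  int main(void) {
      int sum = 0;
      #pragma omp parallel for
      for (int i = 0; i < 1000; i++) {
          // Only one thread at a time executes this block; the unnamed
          // form locks the single global mutex shared by all unnamed
          // critical sections in the program.
          #pragma omp critical
          {
              sum += i;
          }
      }
      printf("sum = %d\n", sum);
      return 0;
  }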
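
An atomic sketch; the pragma protects exactly the one update statement that follows it (count is an illustrative shared variable):

  #include <stdio.h>

  int main(void) {
      int count = 0;
      #pragma omp parallel for
      for (int i = 0; i < 1000; i++) {
          // Protects only the single statement that follows.
          #pragma omp atomic
          count += 1;
      }
      printf("count = %d\n", count);
      return 0;
  }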
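
A lock sketch using the omp_lock_t type and the standard OpenMP lock routines (the shared variable total is made up for illustration):

  #include <omp.h>
  #include <stdio.h>

  int main(void) {
      omp_lock_t lock;
      int total = 0;
      omp_init_lock(&lock);      // initialize before any thread uses it
      #pragma omp parallel for
      for (int i = 0; i < 1000; i++) {
          omp_set_lock(&lock);   // block until the lock is acquired
          total += i;            // update the shared variable
          omp_unset_lock(&lock); // release so other threads can proceed
      }
      omp_destroy_lock(&lock);   // free the lock's resources
      printf("total = %d\n", total);
      return 0;
  }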
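
A reduction sketch; each thread accumulates a private copy of sum, and the copies are combined with + when the loop completes (the array contents are illustrative):

  #include <stdio.h>

  int main(void) {
      int a[1000], sum = 0;
      for (int i = 0; i < 1000; i++) a[i] = i;
      // reduction(+:sum) gives each thread a private sum that is
      // combined with + at the end of the parallel loop.
      #pragma omp parallel for reduction(+:sum)
      for (int i = 0; i < 1000; i++) {
          sum += a[i];
      }
      printf("sum = %d\n", sum);
      return 0;
  }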
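
A barrier/master sketch; the printf calls are hypothetical placeholders for real setup and worker code:

  #include <omp.h>
  #include <stdio.h>

  int main(void) {
      #pragma omp parallel
      {
          // Only the master thread (thread 0) executes this block.
          #pragma omp master
          printf("setup done by thread %d\n", omp_get_thread_num());

          // The master construct has no implied barrier, so this
          // explicit barrier makes every thread wait until the
          // master's setup is finished before proceeding.
          #pragma omp barrier

          printf("thread %d proceeding\n", omp_get_thread_num());
      }
      return 0;
  }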
The following topics briefly describe these forms of synchronization. Check your compiler documentation for details.
