Developer Guide and Reference

OpenMP* Examples

The following examples show how to use several OpenMP* features.

A Simple Difference Operator

This example shows a simple parallel loop in which the amount of work differs from one iteration to the next. Dynamic scheduling is used to improve load balancing. The omp for pragma has a nowait clause because there is already an implicit barrier at the end of the parallel region, so a second barrier at the end of the loop would be redundant.
Example
void for1(float a[], float b[], int n)
{
    int i, j;
    #pragma omp parallel shared(a,b,n)
    {
        #pragma omp for schedule(dynamic,1) private(i,j) nowait
        for (i = 1; i < n; i++)
            for (j = 0; j < i; j++)
                b[j + n*i] = (a[j + n*i] + a[j + n*(i-1)]) / 2.0;
    }
}
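To see why dynamic scheduling helps in this loop, note that outer iteration i performs i inner iterations, so the work grows linearly across the loop. The following sketch is an illustration only. It models a static schedule as contiguous, near-equal blocks of outer iterations, which is an assumption, since the exact partitioning is left to the implementation. The helper names static_block_work and total_work are hypothetical and are not part of the original example.

```c
/* Inner-loop iterations executed by for1 for outer index i (j runs 0 .. i-1). */
static long row_work(int i)
{
    return i;
}

/* Assumed model of a static schedule: outer iterations 1 .. n-1 are split
   into nthreads contiguous blocks of near-equal length, and thread t
   (0 <= t < nthreads) executes block t.
   Returns the number of inner iterations thread t performs under that model. */
long static_block_work(int n, int nthreads, int t)
{
    int total = n - 1;               /* number of outer iterations */
    int base  = total / nthreads;    /* minimum block length */
    int extra = total % nthreads;    /* the first `extra` blocks get one more */
    int begin = 1 + t * base + (t < extra ? t : extra);
    int len   = base + (t < extra ? 1 : 0);
    long work = 0;

    for (int i = begin; i < begin + len; i++)
        work += row_work(i);
    return work;
}

/* Total inner iterations over the whole loop nest: 1 + 2 + ... + (n-1). */
long total_work(int n)
{
    return (long)n * (n - 1) / 2;
}
```

Under this model, with n = 9 and four threads, threads 0 through 3 would perform 3, 7, 11, and 15 of the 36 inner iterations. With schedule(dynamic,1), rows are instead handed out one at a time as threads become free, which evens out the imbalance.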

Two Difference Operators: for Loop Version

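This example places two parallel loops in a single parallel region to reduce fork/join overhead. The first omp for pragma has a nowait clause because the second loop uses none of the data that the first loop reads or writes, so a thread can start on the second loop without waiting for the others to finish the first. The second omp for pragma also has a nowait clause, because the end of the parallel region already provides an implicit barrier.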
Example
void for2(float a[], float b[], float c[], float d[], int n, int m)
{
    int i, j;
    #pragma omp parallel shared(a,b,c,d,n,m) private(i,j)
    {
        #pragma omp for schedule(dynamic,1) nowait
        for (i = 1; i < n; i++)
            for (j = 0; j < i; j++)
                b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] )/2.0;

        #pragma omp for schedule(dynamic,1) nowait
        for (i = 1; i < m; i++)
            for (j = 0; j < i; j++)
                d[j + m*i] = ( c[j + m*i] + c[j + m*(i-1)] )/2.0;
    }
}
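One way to gain confidence that the nowait clauses are safe is to compare the parallel routine against a serial reference. The sketch below is a minimal, hypothetical test harness, not part of the original guide. It assumes a and b are n-by-n arrays and c and d are m-by-m arrays in row-major order (inferred from the j + n*i indexing), and the names diff_serial and for2_matches_serial are invented for illustration. The for2 routine is repeated so that the block is self-contained.

```c
#include <stdlib.h>

/* The for2 routine from the example above, repeated for self-containment. */
void for2(float a[], float b[], float c[], float d[], int n, int m)
{
    int i, j;
    #pragma omp parallel shared(a,b,c,d,n,m) private(i,j)
    {
        #pragma omp for schedule(dynamic,1) nowait
        for (i = 1; i < n; i++)
            for (j = 0; j < i; j++)
                b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] )/2.0;

        #pragma omp for schedule(dynamic,1) nowait
        for (i = 1; i < m; i++)
            for (j = 0; j < i; j++)
                d[j + m*i] = ( c[j + m*i] + c[j + m*(i-1)] )/2.0;
    }
}

/* Hypothetical serial reference: one difference operator on a k-by-k array. */
static void diff_serial(const float in[], float out[], int k)
{
    for (int i = 1; i < k; i++)
        for (int j = 0; j < i; j++)
            out[j + k*i] = ( in[j + k*i] + in[j + k*(i-1)] )/2.0;
}

/* Returns 1 if for2 reproduces the serial result exactly, 0 otherwise
   (including on allocation failure). Inputs are small integers, so every
   average is exactly representable and an exact comparison is valid. */
int for2_matches_serial(int n, int m)
{
    size_t nn = (size_t)n * (size_t)n, mm = (size_t)m * (size_t)m;
    float *a     = malloc(nn * sizeof *a);
    float *b     = calloc(nn, sizeof *b);
    float *b_ref = calloc(nn, sizeof *b_ref);
    float *c     = malloc(mm * sizeof *c);
    float *d     = calloc(mm, sizeof *d);
    float *d_ref = calloc(mm, sizeof *d_ref);
    int ok = a && b && b_ref && c && d && d_ref;

    if (ok) {
        for (size_t k = 0; k < nn; k++) a[k] = (float)(k % 7);
        for (size_t k = 0; k < mm; k++) c[k] = (float)(k % 5);

        for2(a, b, c, d, n, m);
        diff_serial(a, b_ref, n);
        diff_serial(c, d_ref, m);

        for (size_t k = 0; k < nn; k++) if (b[k] != b_ref[k]) ok = 0;
        for (size_t k = 0; k < mm; k++) if (d[k] != d_ref[k]) ok = 0;
    }

    free(a); free(b); free(b_ref);
    free(c); free(d); free(d_ref);
    return ok;
}
```

A call such as for2_matches_serial(8, 5) should return 1 whether or not OpenMP is enabled at compile time, because the two loops write disjoint arrays and each loop's iterations are independent of one another.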

Two Difference Operators: sections Version

This example demonstrates the omp sections pragma. The logic is identical to the preceding omp for example, but it uses omp sections instead of omp for. Here the speedup is limited to two because there are only two units of work, whereas the preceding example has (n-1) + (m-1) units of work.
Example
void sections1(float a[], float b[], float c[], float d[], int n, int m)
{
    int i, j;
    #pragma omp parallel shared(a,b,c,d,n,m) private(i,j)
    {
        #pragma omp sections nowait
        {
            #pragma omp section
            for (i = 1; i < n; i++)
                for (j = 0; j < i; j++)
                    b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] )/2.0;

            #pragma omp section
            for (i = 1; i < m; i++)
                for (j = 0; j < i; j++)
                    d[j + m*i] = ( c[j + m*i] + c[j + m*(i-1)] )/2.0;
        }
    }
}
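The limit of two on the speedup can be made more precise with a rough estimate. Each section runs on a single thread, so the elapsed time cannot be shorter than the longer of the two sections. The sketch below assumes that run time is proportional to the number of inner-loop iterations and ignores all scheduling overhead. The function names operator_work and sections_speedup_bound are hypothetical.

```c
/* Inner-loop iterations performed by one difference operator on a k-by-k
   array: row i does i iterations for i = 1 .. k-1, giving k*(k-1)/2. */
static double operator_work(int k)
{
    return k > 1 ? 0.5 * (double)k * (double)(k - 1) : 0.0;
}

/* Upper bound on the speedup of the sections version over serial execution,
   under the assumed cost model: total work divided by the longer section.
   Returns 1.0 when there is no work at all. */
double sections_speedup_bound(int n, int m)
{
    double wn = operator_work(n);
    double wm = operator_work(m);
    double longest = wn > wm ? wn : wm;

    if (longest == 0.0)
        return 1.0;
    return (wn + wm) / longest;
}
```

Under this model the bound reaches 2 only when n equals m. When the sizes are very different, for example n = 101 and m = 11, the bound is close to 1, because one thread does almost all of the work while the other is idle.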

Updating a Shared Scalar

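This example demonstrates how to use a single construct to update an element of the shared array a. The optional nowait clause is omitted from the first loop because every thread must finish that loop before any thread enters the single construct. The single construct in turn ends with its own implicit barrier, so the updated value of a[0] is visible to all threads before the second loop begins.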
Example
/* MIN is not defined in the original listing; a conventional macro is assumed here. */
#define MIN(x, y) ((x) < (y) ? (x) : (y))

void sp_1a(float a[], float b[], int n)
{
    int i;
    #pragma omp parallel shared(a,b,n) private(i)
    {
        #pragma omp for
        for (i = 0; i < n; i++)
            a[i] = 1.0 / a[i];

        #pragma omp single
        a[0] = MIN( a[0], 1.0 );

        #pragma omp for nowait
        for (i = 0; i < n; i++)
            b[i] = b[i] / a[i];
    }
}
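The effect of the three phases can be traced on a small input. The sketch below repeats the routine, with MIN assumed to be a conventional minimum macro since the listing does not define it, and adds a hypothetical helper, sp_1a_check, that compares the output against values worked out by hand. The input values are chosen so that every result is exactly representable in float.

```c
/* Assumed definition: the original listing uses MIN without defining it. */
#define MIN(x, y) ((x) < (y) ? (x) : (y))

void sp_1a(float a[], float b[], int n)
{
    int i;
    #pragma omp parallel shared(a,b,n) private(i)
    {
        /* Phase 1: replace each a[i] by its reciprocal (barrier at loop end). */
        #pragma omp for
        for (i = 0; i < n; i++)
            a[i] = 1.0 / a[i];

        /* Phase 2: one thread clamps a[0] to at most 1 (barrier at end). */
        #pragma omp single
        a[0] = MIN( a[0], 1.0 );

        /* Phase 3: divide b by the updated a. */
        #pragma omp for nowait
        for (i = 0; i < n; i++)
            b[i] = b[i] / a[i];
    }
}

/* Hypothetical check against hand-computed values; returns 1 on success. */
int sp_1a_check(void)
{
    float a[3] = {0.5f, 2.0f, 4.0f};
    float b[3] = {3.0f, 1.0f, 1.0f};

    sp_1a(a, b, 3);

    /* Reciprocals are 2, 0.5, 0.25; a[0] is then clamped from 2 to 1,
       so b becomes 3/1, 1/0.5, 1/0.25. */
    return a[0] == 1.0f && a[1] == 0.5f && a[2] == 0.25f
        && b[0] == 3.0f && b[1] == 2.0f && b[2] == 4.0f;
}
```

If the barrier after the first loop were removed, the thread executing the single construct could read a[0] before another thread had stored its reciprocal, and the final values would then depend on timing.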
