Developer Guide and Reference

  • 2021.2
  • 04/07/2021
  • Public Content

parallel/noparallel

Resolves dependencies to facilitate auto-parallelization of the immediately following loop (parallel) or prevents auto-parallelization of the immediately following loop (noparallel).

Syntax

#pragma parallel [clause[ [,]clause]...]
#pragma noparallel
Arguments
clause
Can be any of the following:
always [assert]
Overrides compiler heuristics that estimate whether parallelizing a loop would increase performance. Using this clause on a loop that the compiler finds to be parallelizable tells the compiler to parallelize the loop even if doing so might not improve performance.
If assert is added, the compiler generates an error-level assertion test that displays a message saying that the compiler efficiency heuristics indicate that the loop cannot be parallelized.
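As a minimal sketch of the always clause: the function below is a hypothetical helper (not part of this guide) whose loop is short enough that the compiler's heuristics might decline to thread it. The sketch assumes a compiler that honors this pragma with auto-parallelization enabled. Compilers that do not recognize the pragma typically ignore it and run the loop serially, with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scales a vector in place. */
void scale_in_place(float *v, int n, float factor) {
    int i;
    assert(v != NULL);
    /* 'always' asks the compiler to parallelize the loop even if its
       profitability estimate says threading may not pay off. The loop
       must still be one the compiler finds parallelizable. */
    #pragma parallel always
    for (i = 0; i < n; i++) {
        v[i] *= factor;
    }
}
```

Writing `#pragma parallel always assert` instead would, per the description above, turn a failure to parallelize into a compile-time error message.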
firstprivate (var[:expr] ...)
Provides a superset of the functionality provided by the private clause. Variables that appear in a firstprivate list are subject to private clause semantics. In addition, the initial value of each variable is broadcast to all of its private instances upon entering the parallel loop.
lastprivate (var[:expr] ...)
Provides a superset of the functionality provided by the private clause. Variables that appear in a lastprivate list are subject to private clause semantics. In addition, upon exiting the parallel loop, each variable has the value that results from the sequentially last iteration of the loop.
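A minimal sketch of lastprivate, assuming a compiler that honors this pragma (others typically ignore it and run the loop serially). The function name is hypothetical. The temporary t is private inside the loop, and after the loop it holds the value written by iteration n-1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes in[i]^2 to out[i] and returns the
   square computed by the sequentially last iteration. */
int square_all(const int *in, int *out, int n) {
    int i;
    int t = 0;
    assert(in != NULL && out != NULL);
    assert(n > 0);   /* with no iterations there is no "last" value */
    #pragma parallel lastprivate(t)
    for (i = 0; i < n; i++) {
        t = in[i] * in[i];   /* written before it is read in each iteration */
        out[i] = t;
    }
    return t;   /* value from iteration i == n - 1 */
}
```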
num_threads (n)
Parallelizes the loop across n threads, where n is an integer.
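A minimal sketch of num_threads, assuming a compiler that honors this pragma (others typically ignore it and run the loop serially). The function name and the thread count of 4 are illustrative choices, not recommendations from this guide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: y[i] += a * x[i] for every element. */
void scaled_add(float *y, const float *x, float a, int n) {
    int i;
    assert(y != NULL && x != NULL);
    /* Request that the loop be divided across 4 threads. */
    #pragma parallel num_threads(4)
    for (i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}
```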
private (var[:expr] ...)
Specifies a list of scalar and array variables (var) to privatize. An array or pointer variable can take an optional argument (expr), which is an int32 or int64 expression denoting the number of array elements to privatize.
Like the private clause, both the firstprivate and the lastprivate clauses specify a list of scalar and array variables (var) to privatize. An array or pointer variable can take an optional argument (expr), which is an int32 or int64 expression denoting the number of array elements to privatize.
The same var is not allowed to appear in both the private and the lastprivate clauses for the same loop.
The same var is not allowed to appear in both the private and the firstprivate clauses for the same loop.
When expr is absent, the rules on var are the same as with OpenMP. The rules to be observed are as follows:
  • var must not be part of another variable (as an array or structure element)
  • var must not have a const-qualified type unless it is of class type with a mutable member
  • var must not have an incomplete type or a reference type
  • if var is of class type (or array thereof), then it requires an accessible, unambiguous default constructor for the class type. Furthermore, if this var is in a lastprivate clause, then it also requires an accessible, unambiguous copy assignment operator for the class type.
When expr is present, the same rules apply, but var must be an array or a pointer variable.
  • If var is an array, then only its first expr elements are privatized. Without expr, the entire array is privatized.
  • If var is a pointer, then the first expr elements are privatized (element size given by the pointer’s target type). Without expr, only the pointer variable itself is privatized.
  • Program behavior is undefined if expr evaluates to a non-positive value, or if it exceeds the array size.
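The rules for expr can be sketched as follows, assuming a compiler that honors this pragma (others typically ignore it and run the loop serially). The function name, the scratch array, and the SCRATCH_MAX bound are hypothetical. Only the first cols elements of scratch are privatized, and the assertion guards the two conditions that would otherwise make the behavior undefined.

```c
#include <assert.h>
#include <stddef.h>

#define SCRATCH_MAX 8   /* hypothetical capacity of the scratch array */

/* Hypothetical helper: sums[i] = sum of squares of row i of a
   rows x cols matrix stored row by row in m. */
void row_square_sums(const double *m, double *sums, int rows, int cols) {
    double scratch[SCRATCH_MAX];
    int i;
    assert(m != NULL && sums != NULL);
    /* expr must be positive and must not exceed the array size. */
    assert(cols > 0 && cols <= SCRATCH_MAX);
    /* Privatize only the first 'cols' elements of scratch. */
    #pragma parallel private(scratch:cols)
    for (i = 0; i < rows; i++) {
        double s = 0.0;   /* declared in the loop body, so per-iteration */
        for (int j = 0; j < cols; j++) {
            scratch[j] = m[i * cols + j] * m[i * cols + j];
        }
        for (int j = 0; j < cols; j++) {
            s += scratch[j];
        }
        sums[i] = s;
    }
}
```

Declaring s and j inside the loop body keeps them out of the clause list, so the sketch only depends on the var:expr form shown in the syntax above.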
Description
The parallel pragma instructs the compiler to ignore potential dependencies that it assumes could exist and which would prevent correct parallelization in the immediately following loop. However, if dependencies are proven, they are not ignored.
The noparallel pragma prevents auto-parallelization of the immediately following loop.
Use the parallel pragma with care. If a loop has cross-iteration dependencies, annotating it with the parallel pragma can lead to incorrect program behavior.
Only use the parallel pragma if it is known that parallelizing the annotated loop will improve its performance.
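The caution about cross-iteration dependencies can be sketched as follows. The function names are hypothetical, and the sketch assumes a compiler that honors this pragma (others typically ignore it and run the loop serially). The loop is independent across iterations only if dst and src do not overlap, which the compiler usually cannot prove from pointer arguments alone, so the caller's guarantee is checked with an assertion before the annotated loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if [a, a+n) and [b, b+n) share storage. */
static int ranges_overlap(const double *a, const double *b, int n) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t len = (uintptr_t)n * sizeof(double);
    return pa < pb + len && pb < pa + len;
}

/* Hypothetical helper: dst[i] = src[i] + 1.0 for every element. */
void add_one(double *dst, const double *src, int n) {
    int i;
    assert(dst != NULL && src != NULL && n >= 0);
    /* If dst were src + 1, iteration i would write the element that
       iteration i + 1 reads: a cross-iteration dependency. This check
       is conservative and also rejects the harmless case dst == src. */
    assert(!ranges_overlap(dst, src, n));
    #pragma parallel
    for (i = 0; i < n; i++) {
        dst[i] = src[i] + 1.0;
    }
}
```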
Example: Using the parallel pragma
void example(double *A, double *B, double *C, double *D) {
  int i;
  #pragma parallel
  for (i=0; i<10000; i++) {
    A[i] += B[i] + C[i];
    C[i] += A[i] + D[i];
  }
}
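For comparison, a minimal sketch of the noparallel pragma. The function name is hypothetical, and the sketch assumes a compiler that honors this pragma (others typically ignore it, and the loop runs serially in either case). The loop is a small table initialization that is kept serial even when auto-parallelization is enabled for the rest of the file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills table[i] with i * i. */
void fill_squares(int *table, int n) {
    int i;
    assert(table != NULL);
    /* Keep this loop serial; it is not a candidate for auto-parallelization. */
    #pragma noparallel
    for (i = 0; i < n; i++) {
        table[i] = i * i;
    }
}
```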
