Developer Guide and Reference

Programming with Auto-parallelization

The auto-parallelization feature implements some concepts of OpenMP*, such as the worksharing construct (with the PARALLEL for directive). This section provides details on auto-parallelization.

Guidelines for Effective Auto-parallelization Usage

A loop can be parallelized if it meets the following criteria:
  • The loop is countable at compile time: This means that an expression representing how many times the loop will execute (loop trip count) can be generated just before entering the loop.
  • There are no FLOW (READ after WRITE), OUTPUT (WRITE after WRITE) or ANTI (WRITE after READ) loop-carried data dependencies. A loop-carried data dependency occurs when the same memory location is referenced in different iterations of the loop. At the compiler's discretion, a loop may be parallelized if any assumed inhibiting loop-carried dependencies can be resolved by run-time dependency testing.
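
The criteria above can be illustrated with a minimal C sketch. The function names (scale, prefix_sum, shift_left) are hypothetical and chosen only for this illustration; whether a given compiler actually parallelizes the first loop depends on its own analysis and options.

```c
#include <stddef.h>

/* Independent iterations: a[i] depends only on b[i], and the trip count n
   is known on loop entry. The restrict qualifiers state that a and b do
   not overlap, so no loop-carried dependency exists. */
void scale(float *restrict a, const float *restrict b, float s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] * s;
}

/* FLOW (READ after WRITE) dependency: iteration i reads a[i - 1], which
   iteration i - 1 wrote, so the iterations must run in order. */
void prefix_sum(float *a, const float *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}

/* ANTI (WRITE after READ) dependency: iteration i reads a[i + 1], which
   iteration i + 1 later overwrites. */
void shift_left(float *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i + 1];
}
```

Only the first loop meets both criteria as written; the other two reference the same memory location from different iterations.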
The compiler may generate a run-time test for the profitability of executing a loop in parallel when the loop's parameters are not compile-time constants.
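
As a rough hand-written analogue of such a profitability test, the sketch below uses the OpenMP if clause to choose between parallel and serial execution at run time. The function name add_scalar and the cutoff MIN_PARALLEL_TRIP_COUNT are assumptions made for this example; the test the compiler generates internally, and any threshold it uses, are not described here.

```c
#include <stddef.h>

/* Hypothetical cutoff, for illustration only. It is not a value used by
   the compiler. */
#define MIN_PARALLEL_TRIP_COUNT 10000

/* Adds s to each of the n elements of a. The trip count n is not a
   compile-time constant, so the decision is deferred to run time. */
void add_scalar(double *a, double s, size_t n)
{
    /* With OpenMP enabled, the if clause runs the loop in parallel only
       when n reaches the cutoff. Without OpenMP, the pragma is ignored
       and the loop runs serially. */
    #pragma omp parallel for if (n >= MIN_PARALLEL_TRIP_COUNT)
    for (long i = 0; i < (long)n; i++)
        a[i] += s;
}
```

Small trip counts take the serial path, which avoids paying thread start-up cost for little work.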

Coding Guidelines

Enhance the power and effectiveness of the auto-parallelizer by following these coding guidelines:
  • Expose the trip count of loops whenever possible; use constants where the trip count is known and save loop parameters in local variables.
  • Avoid placing structures inside loop bodies that the compiler may assume to carry dependent data, for example, procedure calls, ambiguous indirect references or global references.
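
The following sketch shows one way to apply these guidelines in C. The struct grid type and both function names are hypothetical. The second version copies the loop bound and the data pointer into local variables, so the trip count is visible on loop entry and the loop body no longer reads through the structure pointer on each iteration.

```c
#include <stddef.h>

/* Hypothetical container type used only for this illustration. */
struct grid {
    size_t  count;
    double *values;
};

/* Harder to analyze: the bound and the data pointer are re-read through g
   on every iteration, so a compiler may have to assume they can change. */
void halve_indirect(struct grid *g)
{
    for (size_t i = 0; i < g->count; i++)
        g->values[i] *= 0.5;
}

/* Easier to analyze: the trip count is saved in a local constant, and the
   data is accessed through a local restrict-qualified pointer. */
void halve_local(struct grid *g)
{
    const size_t n = g->count;
    double *restrict v = g->values;

    for (size_t i = 0; i < n; i++)
        v[i] *= 0.5;
}
```

Both functions compute the same result; the difference lies only in how much the compiler can prove about the loop.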

Auto-parallelization Data Flow

For auto-parallelization processing, the compiler performs the following steps:
  1. Data flow analysis:
    Computing the flow of data through the program.
  2. Loop classification:
    Determining loop candidates for parallelization based on correctness and efficiency, as shown by