User Guide

Enabling Task Chunking

Chunking means that the parallel framework will merge several tasks into a single task, with little or no overhead between them. For instance, if tasks are loop iterations, chunking would mean that several iterations are executed together (as a chunk) before heavyweight task control is performed.
Chunking is typically implemented when you convert to a parallel framework:
  • With Intel® Threading Building Blocks (Intel® TBB), by using a parallel_for() instance.
  • With OpenMP*, by using the C/C++ pragma #pragma omp parallel for or the Fortran directive !$omp parallel do.
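As a rough sketch of the OpenMP* case above, the loop below uses the C/C++ pragma together with the standard schedule clause to request chunks of a given size. The function name scale_all and its parameters are hypothetical, chosen only for illustration. If the code is compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the guide): multiplies every element of
// `data` by `factor`. With OpenMP enabled, each thread is handed blocks of
// `chunk` consecutive iterations instead of being scheduled per iteration.
void scale_all(std::vector<double>& data, double factor, int chunk) {
    assert(chunk > 0);  // OpenMP requires a positive chunk size
    const long n = static_cast<long>(data.size());

    #pragma omp parallel for schedule(static, chunk)
    for (long i = 0; i < n; ++i) {
        data[i] *= factor;  // iterations are independent of each other
    }
}
```

Calling scale_all(values, 2.0, 4) on a vector would double each element, with iterations distributed four at a time when OpenMP is active. When no schedule clause is given, the OpenMP runtime chooses how iterations are grouped.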
You can also restructure your code to enable chunking. To do so, split a single loop into a new outer loop and an inner loop that together cover the same iteration space. This technique, called strip-mining, is also used to let an inner loop apply vector operations to small chunks of data: loop vectorization lets the hardware process several array elements with one instruction, in units the size of a vector register (for example, 64 bytes).
Once these two loops exist, move the inner loop inside the task annotations so the task begin and end annotations encapsulate the inner loop. The outer loop strides by some chunk size, and the inner loop iterates sequentially through each chunk.
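The restructuring described above might look like the following minimal sketch. The function name square_chunked and the chunk parameter are hypothetical, and the task annotations are shown only as comments marking where the begin and end annotations would go, so the example does not depend on any annotation header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example (not from the guide): squares each element in place
// using a strip-mined loop, and returns how many chunks (tasks) were run.
std::size_t square_chunked(std::vector<double>& a, std::size_t chunk) {
    assert(chunk > 0);
    const std::size_t n = a.size();
    std::size_t tasks = 0;

    // Outer loop strides through the iteration space by the chunk size.
    for (std::size_t start = 0; start < n; start += chunk) {
        // The last chunk may be shorter than the others.
        const std::size_t end = std::min(start + chunk, n);

        // <task begin annotation would go here>
        // Inner loop iterates sequentially through one chunk.
        for (std::size_t i = start; i < end; ++i) {
            a[i] = a[i] * a[i];
        }
        // <task end annotation would go here>

        ++tasks;  // bookkeeping for illustration only, outside the task
    }
    return tasks;
}
```

With 10 elements and a chunk size of 4, the outer loop runs three times (covering 4, 4, and 2 elements), so task control is performed 3 times instead of 10.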
In cases where the CPU time and the elapsed time are about the same, the Suitability Report window under Runtime impact for this site may recommend that you enable task chunking.
If you check an item to the right of the Scalability of Maximum Site Gain graph (such as Enable Task Chunking), its value will be added to the Site Gain and possibly the Maximum Site Gain for All Sites values.

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804