User Guide

Using Partially Parallel Programs with Intel® Advisor Tools

Intel® Advisor tools are designed to collect data and analyze serial programs. If you have a partially parallel program, before you use the Intel® Advisor Suitability and Dependencies tools to examine it to add more parallelism, read the guidelines in this topic and modify your program so it runs as a serial program with a single thread within each parallel site.

Run Your Program as a Serial Program

To run the current version of your program as a serial program, you need to limit the number of threads to 1. To run your program with a single thread:
  • With Intel® Threading Building Blocks (Intel® TBB), in the main thread create a tbb::task_scheduler_init init(1); object for the lifetime of the program and run the executable again. For example:
        int main() {
            tbb::task_scheduler_init init(1);
            // ...rest of program...
            return 0;
        }
    The effect of task_scheduler_init applies separately to each user-created thread. So if the program creates threads elsewhere, you need to create a tbb::task_scheduler_init init(1); object for that thread's lifetime as well. Use of certain Intel TBB features can prevent the program from running serially. For more information, see the Intel TBB documentation.
  • With OpenMP*, do one of the following:
    • Set the OpenMP* environment variable OMP_NUM_THREADS to 1 before you run the program.
    • Omit the compiler option that enables recognition of OpenMP pragmas and directives. On Windows* OS, omit /Qopenmp, and on Linux* OS omit -openmp.
For more information, see your compiler documentation.
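
As a rough illustration of the OpenMP* options above, the following sketch guards its OpenMP usage with the standard _OPENMP macro, so the same source builds with or without the OpenMP compiler option. The function names serial_thread_limit and sum_of_squares are hypothetical and exist only for this example. The call to omp_set_num_threads(1) is a programmatic alternative that this topic does not describe. It is assumed here to have an effect similar to setting OMP_NUM_THREADS to 1 for parallel regions that do not request their own thread count.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: ask OpenMP for a single thread and return the number
// of threads that later parallel regions may use. When the OpenMP compiler
// option is omitted, _OPENMP is undefined and the program is already serial.
int serial_thread_limit() {
#ifdef _OPENMP
    omp_set_num_threads(1);
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical workload: a loop that is parallel only when OpenMP is enabled.
long sum_of_squares(int n) {
    long total = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
    for (int i = 0; i < n; ++i) {
        total += static_cast<long>(i) * i;
    }
    return total;
}
```

Calling serial_thread_limit() at the start of main(), before any parallel region runs, keeps the rest of the program unchanged while limiting it to one thread.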

Add or Remove Intel® Advisor Annotations

Intel® Advisor site, task, and lock annotations are used by the Suitability and Dependencies tools. You can add Intel® Advisor parallel site and task annotations to mark the already parallel code regions. For example, in the nqueens_Advisor sample file nqueens_cilk.cpp:
    ...
    ANNOTATE_SITE_BEGIN(solve);
    cilk_for(int i=0; i<size; i++) {
        // try all positions in first row using separate array for each recursion
        ANNOTATE_ITERATION_TASK(setQueen);
        int * queens = new int[size];
        setQueen(queens, 0, i);
    }
    ANNOTATE_SITE_END();
If needed, you can comment out annotations, or wrap them in preprocessor directives for conditional compilation. For example, use the #ifdef, #ifndef, and #endif preprocessor directives:
    ...
    // Comment out the next line to hide the annotations.
    #define ANNOTATE_ON
    . . .
    #ifdef ANNOTATE_ON
        ANNOTATE_SITE_BEGIN(solve);
    #endif
    #ifndef ANNOTATE_ON
        // add parallel code here
    . . .
    #endif
    #ifdef ANNOTATE_ON
        ANNOTATE_SITE_END();
    #endif
    ...
After you add the parallel framework code and test it, you can remove the annotations.
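
One possible way to avoid repeating #ifdef around every annotation is to define wrapper macros once. The sketch below is an assumption for illustration, not a pattern taken from the product samples: the wrapper names SITE_BEGIN, ITERATION_TASK, and SITE_END, the function square_all, and the site and task names are all hypothetical. It also assumes that the annotation macros come from the advisor-annotate.h header when ANNOTATE_ON is defined.

```cpp
// Uncomment the next line to compile the annotations in. Doing so assumes
// the annotation header advisor-annotate.h is on the include path.
// #define ANNOTATE_ON

#ifdef ANNOTATE_ON
#include "advisor-annotate.h"
#define SITE_BEGIN(name)     ANNOTATE_SITE_BEGIN(name)
#define ITERATION_TASK(name) ANNOTATE_ITERATION_TASK(name)
#define SITE_END()           ANNOTATE_SITE_END()
#else
// Annotations hidden: the hypothetical wrappers expand to nothing.
#define SITE_BEGIN(name)
#define ITERATION_TASK(name)
#define SITE_END()
#endif

// Hypothetical workload: square each element in place. Every iteration
// touches a different element, so each iteration is marked as a task.
void square_all(int* values, int count) {
    SITE_BEGIN(square_site);
    for (int i = 0; i < count; ++i) {
        ITERATION_TASK(square_task);
        values[i] *= values[i];
    }
    SITE_END();
}
```

With ANNOTATE_ON left undefined, the function compiles to a plain loop, so the same source can be built with or without the annotations.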

Effect of Parallel Code on Intel® Advisor Tools' Reports

Because Intel® Advisor tools are designed to collect data and analyze serial program targets, parallel code that creates one or more threads within any annotated parallel site usually causes the Suitability or Dependencies tool reports to contain unreliable data. To use these two tools, ensure that only a single thread runs within each parallel site. Also, when using parallel frameworks that use dynamic scheduling or work stealing at run-time, execution times can be assigned to the wrong source code.
If you use the Survey tool to profile your program, the Self Time in the Survey Report shows the sum of the CPU time for all threads. However, because Intel® Advisor's purpose is to analyze serial code, some of the time used by parallel code may be added to the wrong places. For example, Self Time may be added to the parallel framework run-time system entry points instead of the caller(s) in the thread that entered the parallel region. Also in the Survey Report, when examining parallel code, some entry points may be parallel framework run-time system entry points instead of the expected functions or loops. Similarly, in the Survey Source window, for a parallel code region the Total Time (and Loop Time) shows the sum of the CPU time for all threads.
Because Intel® Advisor's purpose is to analyze serial code, in the Suitability Report:
  • Intel® Advisor assumes there is only a single thread (no parallelism) within any annotated parallel site, including its task(s) and lock(s). When only a single thread executes within a parallel site (as expected), the results for that site may be correct. If the application has multiple parallel sites, and one or more sites were executed by multiple threads, the next two items apply.
  • If multiple threads execute within any parallel site, the reported Maximum Program Gain and that site's Impact on Program Gain values are not reliable. To obtain correct values, ensure that only a single thread executes for all parallel sites (see Run Your Program as a Serial Program above).
  • If multiple threads execute within a parallel site, the results for that site will be unpredictable and its values will not be reliable. Also, if one thread executes the parallel site annotations and a second thread executes the task annotation(s), the site may appear to not have any tasks and the tasks may appear to not execute within a site. To obtain correct values, ensure that only a single thread executes within each parallel site (see Run Your Program as a Serial Program above).
  • Any work-stealing constructs within the site will cause extra time to be added to the suspended site and/or task. All Suitability Report times are approximate.
Similarly in the Dependencies Report, if any parallel site uses multiple threads, this may prevent certain problems from being detected and reported by the Dependencies tool. To obtain correct values, ensure that only a single thread executes within each parallel site (see Run Your Program as a Serial Program above).
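
Before collecting data, it can help to confirm that a code region really is entered by only one thread. The class below is a hypothetical debugging aid written for this topic. It is not part of Intel® Advisor and does not use its annotations. It is a minimal sketch using only the C++ standard library: it remembers the first thread that calls enter() and counts calls from any other thread.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper: records the first thread that enters a region and
// counts every later entry made by a different thread.
class SiteThreadCheck {
public:
    // Call at the start of the region under test. Returns true when the
    // calling thread is the first (owning) thread, false otherwise.
    bool enter() {
        const std::thread::id self = std::this_thread::get_id();
        std::lock_guard<std::mutex> lock(mutex_);
        if (!seen_) {
            owner_ = self;
            seen_ = true;
            return true;
        }
        if (owner_ == self) {
            return true;
        }
        ++violations_;
        return false;
    }

    // Number of entries made by threads other than the first one.
    int violations() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return violations_;
    }

private:
    mutable std::mutex mutex_;
    std::thread::id owner_;
    bool seen_ = false;
    int violations_ = 0;
};
```

One way to use it is to keep one SiteThreadCheck object per region, call enter() where the region begins, and check that violations() is still zero when the program exits. A nonzero count suggests that the region is still running on more than one thread.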

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804