User Guide

  • 2020
  • 06/18/2020

Annotations and OpenMP* Code

This topic explains the steps needed to implement the parallelism proposed by the Intel Advisor annotations by adding OpenMP* parallel framework code.
The recommended order for replacing the annotations with OpenMP code:
  1. Add appropriate synchronization of shared resources, using LOCK annotations as a guide.
  2. Test the still-serial program to verify that the synchronization changes did not break anything, before parallel tasks introduce the possibility of non-deterministic behavior.
  3. Add code to create OpenMP parallel sections or equivalent, using the SITE/TASK annotations as a guide.
  4. Test with one thread to verify that your program still works correctly. For example, set the environment variable OMP_NUM_THREADS to 1 before you run your program.
  5. Test with more than one thread to see that the multithreading works as expected.
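The steps above can be sketched in C as follows. This is a minimal sketch under stated assumptions, not code from Intel Advisor: the names body and sum_with_critical are hypothetical, the critical section stands in for a spot where a LOCK annotation pair marked an update to shared data, and the parallel for stands in for a SITE/TASK annotated loop.

```c
/* Hypothetical per-iteration work: the value contributed by iteration i. */
static long body(int i) {
    return 2L * i;
}

/* Sums body(i) for i in [0, n).
   Step 1: the update of the shared variable is synchronized (critical).
   Step 3: the loop iterations are distributed over OpenMP worker threads.
   If the compiler's OpenMP option is off, the pragmas are ignored and the
   function runs serially with the same result. */
long sum_with_critical(int n) {
    long total = 0;                 /* shared by all threads */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        long v = body(i);           /* private to each iteration */
        #pragma omp critical
        total += v;                 /* one thread at a time */
    }
    return total;
}
```

For steps 4 and 5, a call such as sum_with_critical(1000) should return 999000 whether OMP_NUM_THREADS is set to 1 or to a larger value. Build with your compiler's OpenMP option (for example, -fopenmp with GCC or -qopenmp with the Intel compiler on Linux).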
OpenMP creates worker threads automatically. In general, you should concern yourself only with the tasks, and leave it to the parallel frameworks to create and destroy the worker threads.
If you do need some control over creation and destruction of worker threads, see the compiler documentation. For example, to limit the number of threads, set the OMP_THREAD_LIMIT or the OMP_NUM_THREADS environment variable.
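To check which thread count is in effect, a program can ask the OpenMP runtime. The sketch below is illustrative only: max_worker_threads is a hypothetical helper name, while omp_get_max_threads is the standard OpenMP runtime routine that returns an upper bound on the number of threads a parallel region would use.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the upper bound on the number of threads that a parallel region
   without a num_threads clause would use. Falls back to 1 when the code is
   compiled without OpenMP support. */
int max_worker_threads(void) {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

With OMP_NUM_THREADS set to 1 before the run, this helper would be expected to return 1, which is a quick way to confirm that a single-threaded test run really is single-threaded.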
The examples below pair the serial, annotated program code with the equivalent OpenMP C/C++ and Fortran parallel code for some typical code to which parallelism can be applied. In each pair, the serial C/C++ or Fortran code with Intel Advisor annotations comes first, followed by the parallel code using OpenMP.

Synchronization, C/C++

Serial code with Intel Advisor annotations:

    // Synchronization, C/C++
    ANNOTATE_LOCK_ACQUIRE(0);
    Body();
    ANNOTATE_LOCK_RELEASE(0);

Parallel code using OpenMP:

    // Synchronization can use OpenMP
    // critical sections, atomic operations, locks,
    // and reduction operations (shown later)

Synchronization, Fortran

Serial code with Intel Advisor annotations:

    ! Synchronization, Fortran
    call annotate_lock_acquire(0)
    body
    call annotate_lock_release(0)

Parallel code using OpenMP:

    ! Synchronization can use OpenMP
    ! critical sections, atomic operations, locks,
    ! and reduction operations (shown later)

Parallelize data, C/C++ counted loop

Serial code with Intel Advisor annotations:

    // Parallelize data - one task within a
    // C/C++ counted loop
    ANNOTATE_SITE_BEGIN(site);
    for (i = lo; i < n; ++i) {
        ANNOTATE_ITERATION_TASK(task);
        statement;
    }
    ANNOTATE_SITE_END();

Parallel code using OpenMP:

    // Parallelize data - one task, C/C++ counted loops
    #pragma omp parallel for
    for (int i = lo; i < n; ++i) {
        statement;
    }

Parallelize data, Fortran counted loop

Serial code with Intel Advisor annotations:

    ! Parallelize data - one task within a
    ! Fortran counted loop
    call annotate_site_begin("site1")
    do i = 1, N
        call annotate_iteration_task("task1")
        statement
    end do
    call annotate_site_end

Parallel code using OpenMP:

    ! Parallelize data - one task within a
    ! Fortran counted loop
    !$omp parallel do
    do i = 1, N
        statement
    end do
    !$omp end parallel do

Parallelize C/C++ functions

Serial code with Intel Advisor annotations:

    // Parallelize C/C++ functions
    ANNOTATE_SITE_BEGIN(site);
    ANNOTATE_TASK_BEGIN(task1);
    function_1();
    ANNOTATE_TASK_END();
    ANNOTATE_TASK_BEGIN(task2);
    function_2();
    ANNOTATE_TASK_END();
    ANNOTATE_SITE_END();

Parallel code using OpenMP:

    // Parallelize C/C++ functions
    #pragma omp parallel // start parallel region
    {
        #pragma omp sections
        {
            #pragma omp section
            function_1();
            #pragma omp section
            function_2();
        }
    } // end parallel region

Parallelize Fortran functions

Serial code with Intel Advisor annotations:

    ! Parallelize Fortran functions
    call annotate_site_begin("site1")
    call annotate_task_begin("task1")
    call subroutine_1
    call annotate_task_end
    call annotate_task_begin("task2")
    call subroutine_2
    call annotate_task_end
    call annotate_site_end

Parallel code using OpenMP:

    ! Parallelize Fortran functions
    !$omp parallel ! start parallel region
    !$omp sections
    !$omp section
    call subroutine_1
    !$omp section
    call subroutine_2
    !$omp end sections
    !$omp end parallel ! end parallel region
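The synchronization rows name several OpenMP mechanisms without showing them. As a hedged illustration of two of them, the C sketch below uses an atomic update for a shared counter and a reduction clause for a sum. The function names count_even_atomic and sum_reduction, and the choice of example data, are assumptions made for this sketch; which mechanism fits a given LOCK annotation depends on the code being protected.

```c
#include <stddef.h>

/* Counts the even values in data[0..n).
   The atomic directive protects the single update of the shared counter. */
long count_even_atomic(const int *data, size_t n) {
    long count = 0;
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        if (data[i] % 2 == 0) {
            #pragma omp atomic
            count += 1;
        }
    }
    return count;
}

/* Sums the values in data[0..n).
   The reduction clause gives each thread a private partial sum and
   combines the partial sums when the loop ends, so no lock is needed. */
long sum_reduction(const int *data, size_t n) {
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; ++i) {
        sum += data[i];
    }
    return sum;
}
```

For an array holding 1, 2, 3, 4, 5, 6, count_even_atomic returns 3 and sum_reduction returns 21, independent of the number of threads. A reduction usually scales better than a critical section or atomic update when every iteration touches the shared variable, because the threads do not contend on each update.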
