Developer Guide and Reference

OpenMP* Advanced Issues

This topic describes how to use the OpenMP* library functions and environment variables, and gives some guidelines for enhancing performance with OpenMP*.
OpenMP* provides specific function calls and environment variables. To refresh your memory about the primary functions and environment variables used in this topic, see the related reference topics in this guide.
To use the function calls, include the omp.h header file, which is installed in the INCLUDE directory during the compiler installation, and compile the application using the [Q]openmp option.
The following example, which demonstrates how to use the OpenMP* functions to print the alphabet, also illustrates several important concepts:
  1. When using functions instead of pragmas, your code must be rewritten; rewrites can mean extra debugging, testing, and maintenance efforts.
  2. It becomes difficult to compile without OpenMP* support.
  3. It is very easy to introduce simple bugs, as in the loop (below) that fails to print all the letters of the alphabet when 26 is not a multiple of the number of threads.
  4. You lose the ability to adjust loop scheduling without creating your own work-queue algorithm, which is a lot of extra effort. You are limited by your own scheduling, which is most likely static scheduling as shown in the example.
Example
#include <stdio.h>
#include <omp.h>

int main(void)
{
    int i;

    omp_set_num_threads(4);

    #pragma omp parallel private(i)
    {
        // 26 is not a multiple of the number of threads (4), so the
        // integer division below drops the remainder and only 24 letters
        // are printed. This can be considered a bug in this code.
        int LettersPerThread = 26 / omp_get_num_threads();
        int ThisThreadNum = omp_get_thread_num();
        int StartLetter = 'a' + ThisThreadNum * LettersPerThread;
        int EndLetter = 'a' + ThisThreadNum * LettersPerThread + LettersPerThread;

        for (i = StartLetter; i < EndLetter; i++) {
            printf("%c", i);
        }
    }
    printf("\n");
    return 0;
}
Debugging threaded applications is a complex process because debuggers change the run-time performance, which can mask race conditions. Even print statements can mask issues, because they use synchronization and operating system functions. OpenMP* itself also adds some complications, because it introduces additional structure by distinguishing private variables from shared variables, and it inserts additional code. A debugger that supports OpenMP* can help you examine variables and step through threaded code. You can use Intel® Inspector to detect many hard-to-find threading errors analytically. Sometimes, a process of elimination can help identify problems without resorting to sophisticated debugging tools.