Building the OpenMP* Version

To build the OpenMP* version, you will modify the sample application to use OpenMP* parallelization and then compile the modified code. You will then start the application and compare the time with the baseline performance time.

  1. Set the build_with_openmp project as the startup project.

  2. For project build_with_openmp, change the compiler to the Intel® C++ Compiler (Project > Intel Compiler > Use Intel C++).

  3. For the project build_with_openmp, make sure the /Qopenmp compiler option is set (Project > Properties > Configuration Properties > C/C++ > Language [Intel C++] > OpenMP Support = Generate Parallel Code (/Qopenmp)). This option is required to enable the OpenMP* extension in the compiler.

  4. Open the source file tachyon.openmp.cpp in the project build_with_openmp.

  5. Make the following changes in the parallel_thread function:

    • Move the iteration-independent value of mboxsize out of the loop.
    • Remove the validity check of video->next_frame.
      • An early exit (such as break or return) from the middle of a parallelized loop is not permitted.
      • The iterations that this check would have skipped are distributed among the threads and do not affect the result.
    • Add a #pragma omp parallel for to the outermost for loop to maximize the work done per thread.
    • Check against the complete change shown in tachyon.openmp_solution.cpp.
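
The changes in step 5 can be sketched roughly as follows. This is a minimal illustration only, not the code from tachyon.openmp_solution.cpp: the function render_rows, the helper shade_pixel, and the formula used for mboxsize are hypothetical stand-ins, and only the overall shape (hoisted invariant, no early exit, pragma on the outermost loop) reflects the steps above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-pixel rendering work in the sample.
static unsigned int shade_pixel(int x, int y, int mboxsize) {
    return static_cast<unsigned int>(x * 31 + y * 17 + mboxsize);
}

// Hypothetical sketch of the restructured loop (not the real parallel_thread).
std::vector<unsigned int> render_rows(int width, int height, int max_objects) {
    assert(width > 0 && height > 0);
    std::vector<unsigned int> image(static_cast<std::size_t>(width) * height);

    // Iteration-independent value, computed once before the loop.
    // The formula is an assumption made for this sketch.
    const int mboxsize = max_objects + 20;

    // The pragma goes on the outermost loop so each thread gets whole rows.
    // The loop body contains no break or return, because leaving an OpenMP
    // worksharing loop early is not permitted.
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Each iteration writes a distinct element, so no locking is needed.
            image[static_cast<std::size_t>(y) * width + x] =
                shade_pixel(x, y, mboxsize);
        }
    }
    return image;
}
```

If the code is compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result, which is a convenient way to check that the restructuring did not change the output.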

The makefile automatically runs the sample after it is built.

Compare the time to render the image to the baseline performance time.

If you wish to explicitly set the number of threads, set the environment variable OMP_NUM_THREADS=N, where N is the number of threads. Alternatively, call the function void omp_set_num_threads(int nthreads), which is declared in omp.h. Make sure to call this function before the parallel region begins.
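
As a rough sketch of the second approach, the helper below requests a thread count and then reports the team size actually observed inside a parallel region. The function name team_size_with is hypothetical and is not part of the sample. The _OPENMP guards are an assumption added so that the code still compiles when OpenMP support is disabled.

```cpp
#ifdef _OPENMP
#include <omp.h>   // declares omp_set_num_threads and omp_get_num_threads
#endif

// Hypothetical helper: request nthreads threads, then return the team size
// seen inside the next parallel region (1 if OpenMP is not enabled).
int team_size_with(int nthreads) {
    int observed = 1;
    (void)nthreads;  // avoids an unused-parameter warning without OpenMP
#ifdef _OPENMP
    // The request must be made before the parallel region starts.
    omp_set_num_threads(nthreads);
#endif
    #pragma omp parallel
    {
        // Only one thread records the value; the others wait at the
        // implicit barrier that ends the single construct.
        #pragma omp single
        {
#ifdef _OPENMP
            observed = omp_get_num_threads();
#endif
        }
    }
    return observed;
}
```

The runtime may provide fewer threads than requested, so the returned value should be treated as an upper-bounded observation rather than a guarantee.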

Options that use OpenMP* are available for both Intel and non-Intel microprocessors, but these options may perform more optimizations on Intel® microprocessors than they perform on non-Intel microprocessors. The major, user-visible OpenMP* constructs and features that may perform differently on Intel versus non-Intel microprocessors include:

  • Internal and user-visible locks

  • The SINGLE construct

  • Explicit and implicit barriers

  • Parallel loop scheduling

  • Reductions

  • Memory allocation

  • Thread affinity and binding

For more complete information about compiler optimizations, see our Optimization Notice.