Using Intel® C++ Composer XE 2011 for Linux to Thread Your Applications

Tachyon is a ray tracer that renders objects described in data files. The Tachyon program is located in the product samples directory: <install-dir>/composerxe/Samples/<locale>/C++/tachyon_compiler.tar.gz. Data files are stored in the directory tachyon/dat; by default we use balls.dat as the input file. Originally, Tachyon implemented its parallelism with explicit threads created by pthread_create(): one thread does the rendering, and the other performs the calculations. In this tutorial we parallelize the calculation thread with OpenMP*, Intel® Threading Building Blocks (Intel® TBB) and Intel® Cilk™ Plus technologies. Parallelization is implemented for only one function, draw_task(), which you can find in the source file src/build_serial/build_serial.cpp.

Prerequisites

  • Intel® C++ Composer XE 2011
  • X Window System

Before you begin

1. Copy the archive file tachyon_compiler.tar.gz to a working directory where you have read/write access.

2. Uncompress the archive:

%tar -zxf tachyon_compiler.tar.gz

A tachyon directory will appear.

3. Change directory to tachyon:

%cd tachyon

4. Set up the Intel® C++ Composer XE 2011 environment

%source <compiler root>/bin/intel64/compilervars.sh {ia32/intel64}

Here <compiler root> is the directory where Intel® C++ Composer XE 2011 is installed. Choose either ia32 or intel64 as the parameter, depending on whether you are targeting a 32-bit or a 64-bit platform.


Workflow Steps

In the following steps, we build different parallel implementations of the same function. When executed, the application displays the time required to render the object in the window title. Comparing this time with the baseline established by the serial implementation in the first step shows the speedup obtained with each parallel implementation. Every source change called for in the steps below is marked in the source itself with a "Todo" keyword in the comments.
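
As a rough illustration of how to read those timings, speedup is the serial render time divided by the parallel render time. The helper below is a hypothetical sketch, not part of the Tachyon sample, and the numbers in the usage note are made up rather than measured.

```cpp
#include <cassert>

// Hypothetical helper, not part of the Tachyon sample.
// Returns how many times faster the parallel build rendered the image.
double speedup(double serial_seconds, double parallel_seconds)
{
    // Both timings must be positive for the ratio to make sense.
    assert(serial_seconds > 0.0 && parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}
```

For instance, a serial time of 8 seconds and a parallel time of 2 seconds would give a speedup of 4. A value below 1 would mean the parallel build was slower than the baseline.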


Building the Serial Version

1. Build and run serial Tachyon:

%make serial

2. Note the time to render the image for your baseline performance measurement


Building with OpenMP*

1. Remove all files that were created in the previous step:

%make clean

2. Open the source file src/build_with_openmp/build_with_openmp.cpp in your favorite editor.

3. Uncomment the OpenMP* pragmas in the routine draw_task that create the parallel region and distribute the loop iterations among the team of threads.

4. Comment out the return inside the parallel region in the routine draw_task.

5. Uncomment the zero assignment to variable ison (ison = 0;) inside the parallel region in the routine draw_task.

6. Uncomment the return at the end of the routine draw_task.

7. Build and run Tachyon with OpenMP* parallelization:

%make openmp

8. Measure performance compared with the serial version
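
To show the shape of the change made in steps 3 to 6, here is a minimal sketch of a row loop parallelized with OpenMP. It is not the Tachyon source: shade_pixel, draw_rows_openmp and the image buffer are hypothetical stand-ins for the sample's rendering code. The loop has no early return because OpenMP does not allow a branch out of a parallel region, which is presumably why the sample replaces its return with the ison flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Tachyon's per-pixel ray tracing.
static unsigned int shade_pixel(int x, int y)
{
    return static_cast<unsigned int>(x * 31 + y * 17);
}

// Illustrative only: fills a width*height image, one row per loop iteration.
void draw_rows_openmp(std::vector<unsigned int>& image, int width, int height)
{
    assert(width > 0 && height > 0);
    image.resize(static_cast<std::size_t>(width) * height);

    // "parallel" creates the team of threads.
    #pragma omp parallel
    {
        // "for" distributes the row iterations among the team.
        #pragma omp for
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                // Each iteration writes only to its own row, so no locking is needed.
                image[static_cast<std::size_t>(y) * width + x] = shade_pixel(x, y);
            }
            // No return here: branching out of a parallel region is not allowed.
        }
    }
}
```

If the file is compiled without the compiler's OpenMP option, the pragmas are ignored and the loop simply runs serially, which makes it easy to check that both builds produce the same image.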

Options that use OpenMP are available for both Intel® and non-Intel microprocessors, but these options may perform additional optimizations on Intel® microprocessors than they perform on non-Intel microprocessors. The list of major, user-visible OpenMP constructs and features that may perform differently on Intel® vs. non-Intel microprocessors includes: locks (internal and user visible), the SINGLE construct, barriers (explicit and implicit), parallel loop scheduling, reductions, memory allocation, and thread affinity and binding.


Building with Intel® TBB

1. Remove all files that were created in the previous step:

%make clean

2. Open the source file src/build_with_tbb/build_with_tbb.cpp in your favorite editor.

3. Uncomment the Intel TBB header files.

4. Uncomment the class draw_task.

5. Comment out the routine draw_task.

6. Uncomment the lines regarding the Intel TBB scheduler and the number of threads in routine thread_trace.

7. Uncomment the lines regarding grain size in routine thread_trace.

8. Uncomment the Intel TBB parallel_for routine in routine thread_trace.

9. Comment out the call of routine draw_task in routine thread_trace.

10. Build and run Tachyon with Intel® TBB parallelization:

%make tbb

11. Measure performance compared with the serial version


Building with Intel® Cilk™ Plus

1. Remove all files that were created in the previous step:

%make clean

2. Open the source file src/build_with_cilk/build_with_cilk.cpp in your favorite editor.

3. Uncomment the Intel Cilk Plus header file.

4. Uncomment the routine draw_task related to Intel Cilk Plus implementation.

5. Comment out the serial draw_task function

6. Build and run Tachyon with Intel® Cilk Plus parallelization:

%make cilk

7. Measure performance compared with the serial version
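
With Intel Cilk Plus, the change amounts to turning the outer row loop into a cilk_for loop. The following is a minimal sketch of that idea, not the Tachyon source: shade_pixel, draw_rows_cilk, ROW_LOOP and the image buffer are hypothetical names. The sketch assumes the compiler defines the __cilk macro when Cilk Plus is enabled, as the Cilk Plus language specification describes, and falls back to a plain serial loop otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef __cilk
#include <cilk/cilk.h>      // provides the cilk_for keyword
#define ROW_LOOP cilk_for   // iterations may run in parallel
#else
#define ROW_LOOP for        // serial fallback for compilers without Cilk Plus
#endif

// Hypothetical stand-in for Tachyon's per-pixel ray tracing.
static unsigned int shade_pixel(int x, int y)
{
    return static_cast<unsigned int>(x * 31 + y * 17);
}

// Illustrative only: fills a width*height image, one row per loop iteration.
void draw_rows_cilk(std::vector<unsigned int>& image, int width, int height)
{
    assert(width > 0 && height > 0);
    image.resize(static_cast<std::size_t>(width) * height);
    unsigned int* pixels = &image[0];

    // The runtime divides the row iterations among its worker threads.
    ROW_LOOP (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Each iteration writes only to its own row, so no locking is needed.
            pixels[static_cast<std::size_t>(y) * width + x] = shade_pixel(x, y);
        }
    }
}
```

Compared with the OpenMP and Intel TBB versions, nothing else in the loop body changes; the runtime decides how to divide the rows.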

For more complete information about compiler optimizations, see our Optimization Notice.