by Aaron Coday
In this course and its accompanying labs, you will become familiar with intermediate to advanced techniques for explicit threading and OpenMP* threading. You’ll demonstrate your understanding of explicit threading by adding GUI responsiveness to the Apfel* application. You’ll demonstrate your understanding of OpenMP threading by improving the performance of the fractal calculation that the program is executing.
To complete the Advanced Multi-Threading labs included in this course, you’ll need the following tools:
- Microsoft Visual Studio*
- Intel® Compiler 7.1 or later
- Intel® Threading Toolkit (recommended, but not required)
Download the Apfel source code (http://download.intel.com/software/products/college/advMT/lab/CONTEST/apfel.htm) and the other files you will need for the labs, and extract them before continuing with the course.
This course is divided into two parts, each structured around one of the multi-threading labs. Each part starts with a description of the multi-threading problem to be addressed, followed by detailed instructions for one possible solution. Each part then concludes with a lab activity in which you implement the proposed solution.
Part 1: Responsiveness
- Launch Microsoft Visual Studio.
- Open the workspace C:\Lab\CONTEST\apfel\apfel.dsw.
- Make sure that the Intel Compiler is selected.
- Build the application by selecting Release build and then Build (F7).
- Press Ctrl-F5 to run the application.
Adding a thread function to CApfelRun
Add a thread function to CApfelRun, then start the thread and pass it the information it needs. The new thread is responsible for running the DoRun method, so the GUI thread is free to keep responding to the user.
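The pattern described above can be sketched as follows. This is a minimal illustration, not the lab solution: RunSketch is a hypothetical stand-in for CApfelRun (the real class is not shown here), its DoRun body is a placeholder calculation, and std::thread is swapped in for the Win32/MFC thread-creation call so that the sketch stays portable. The static function taking a void* mirrors the shape of a Win32-style thread entry point.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for CApfelRun; the real class is not shown in the course text.
class RunSketch {
public:
    ~RunSketch() { Wait(); }  // never destroy the object while the worker still runs

    // Placeholder for the long-running fractal calculation.
    void DoRun() {
        long sum = 0;
        for (int i = 1; i <= 1000; ++i) sum += i;
        result_ = sum;
        done_ = true;
    }

    // Thread function: static, so it has a plain function signature like a
    // Win32-style entry point. The object to work on arrives through `param`.
    static unsigned ThreadFunc(void* param) {
        assert(param != nullptr);
        RunSketch* self = static_cast<RunSketch*>(param);
        self->DoRun();
        return 0;
    }

    // Start the worker and return immediately, so the caller (the GUI
    // thread in the lab) is not blocked. std::thread is an assumption here.
    void Start() {
        worker_ = std::thread(&RunSketch::ThreadFunc, static_cast<void*>(this));
    }

    // Block until the worker has finished, if one was started.
    void Wait() {
        if (worker_.joinable()) worker_.join();
    }

    bool Done() const { return done_; }
    long Result() const { return result_; }

private:
    std::thread worker_;
    std::atomic<bool> done_{false};
    long result_ = 0;
};
```

A caller would create a RunSketch, call Start() from the GUI handler, and later check Done() or call Wait() before reading Result(). Passing `this` through the void* parameter is what "passing the necessary information into the thread" amounts to in this sketch.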
Part 2: Performance
You can use multi-threading to add extra functionality, to increase performance, or both. You should know and be able to use both explicit threading (Win32*) and OpenMP*.
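As a rough illustration of the OpenMP side, the sketch below parallelizes a fractal loop by rows. It assumes a Mandelbrot-style escape-time calculation; the actual Apfel kernel is not shown in this course text, and the names EscapeCount and ComputeFractal, the image region, and the choice of dynamic scheduling are all illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Escape-time iteration count for one point c = (cr, ci).
int EscapeCount(double cr, double ci, int maxIter) {
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < maxIter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;  // real part of z*z + c
        zi = 2.0 * zr * zi + ci;            // imaginary part of z*z + c
        zr = t;
        ++n;
    }
    return n;
}

// Fill a width x height image covering [-2, 1) x [-1.5, 1.5).
std::vector<int> ComputeFractal(int width, int height, int maxIter) {
    assert(width > 0 && height > 0 && maxIter > 0);
    std::vector<int> image(static_cast<std::size_t>(width) * height);

    // Each row writes only its own pixels, so rows can run in parallel.
    // Dynamic scheduling is assumed because rows crossing the set cost more.
    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; ++y) {
        double ci = -1.5 + 3.0 * y / height;
        for (int x = 0; x < width; ++x) {
            double cr = -2.0 + 3.0 * x / width;
            image[static_cast<std::size_t>(y) * width + x] =
                EscapeCount(cr, ci, maxIter);
        }
    }
    return image;
}
```

If the compiler's OpenMP option is not enabled, the pragma is ignored and the loop simply runs serially, which makes it easy to compare the two builds for correctness and speed.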
Intel® Threading Toolkit
- Intel® Thread Checker
- Thread Profiler
- Intel® VTune™ Performance Analyzer
Intel® Thread Checker
- Locate threading bugs in applications on IA-32 systems running Windows*
- Use remote collectors to locate threading bugs in applications on IA-32 and Itanium®-based systems running Linux*
Running Intel® Thread Checker
Statistics collected within VTune™ analyzer
- Compile with icl /Qopenmp_profile (/MD /Qopenmp)
Statistics collected outside VTune analyzer
- Compile with icl /Qopenmp_profile
- Run program outside VTune environment
- Import guide.gvs statistics file into VTune analyzer
To import a guide.gvs file, choose File/Open File and select the OpenMP Statistics (*.gvs) file type.
- For Windows*, locate performance bottlenecks in Win32* and OpenMP* threaded applications
- For Linux*, locate performance bottlenecks in POSIX* and OpenMP threaded applications from a host Windows system
- View graphic displays of each thread's state and of parallel-serial transitions to confirm that performance meets expectations or to see where it falls short; this helps you decide where to focus optimization efforts
Intel® VTune™ Performance Analyzer
- Links to source view
- Error context
- Error locations
- Stack trace
Appendix – Win32 Threads
The following is a review of common Win32 threading functions.
Creating Win32* Threads
Waiting for Kernel Objects
This is the hub function for synchronization.
DWORD WaitForSingleObject(
    HANDLE hHandle,
    DWORD  dwMilliseconds);   // Timeout (0 .. INFINITE)
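The timeout behavior of such a wait can be illustrated with a small portable sketch. This is not the Win32 API: EventSketch and WaitResult are hypothetical names, and a standard C++ mutex and condition variable are swapped in for the kernel object. The sketch models only a finite timeout in milliseconds; the INFINITE case is not modeled.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Portable sketch of a manual-reset event; an analogy, not the Win32 API.
enum class WaitResult { Signaled, TimedOut };

class EventSketch {
public:
    // Mark the event as signaled and wake every waiting thread.
    void Set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Wait up to timeoutMs milliseconds for the event to become signaled.
    // A timeout of 0 only tests the current state and returns at once.
    WaitResult Wait(long timeoutMs) {
        assert(timeoutMs >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        bool ok = cv_.wait_for(lock, std::chrono::milliseconds(timeoutMs),
                               [this] { return signaled_; });
        return ok ? WaitResult::Signaled : WaitResult::TimedOut;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};
```

A worker would call Set() when its calculation finishes, while another thread calls Wait(0) to poll without blocking or Wait with a longer timeout to block for a bounded time. The two return values play the role of the "object signaled" and "timed out" outcomes of the real call.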
BOOL CWnd::PostMessage(
    UINT   message,    // Message (WM_DONE)
    WPARAM wParam,     // Additional message info
    LPARAM lParam);    // Additional message info