Compiler Methodology for Intel® MIC Architecture
This article is part of the Intel® Modern Code Developer Community documentation, which helps developers improve application performance through a systematic, step-by-step optimization methodology. This article addresses thread-level parallelization.
This chapter covers parallelization. It links to various parallelization methods and resources and offers tips and techniques for getting optimal parallel performance.
In this chapter, you will learn techniques for the Intel OpenMP* runtime library provided with the Intel compilers, Intel® MPI, Intel® Cilk™ Plus, and Intel® Threading Building Blocks (Intel® TBB).
The following subchapters provide more information on parallelization topics. Click the links below to access these topics.
Choosing the Right Threading Framework - A comparison of different Parallel Programming options
In this chapter, various parallelization methods were presented. For OpenMP, two major performance techniques were presented: controlling thread affinity and controlling OpenMP scheduling.
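As a reminder of what controlling OpenMP scheduling looks like in source code, here is a minimal sketch in C. It is not taken from the subchapters. The function name sum_of_squares and its chunk parameter are hypothetical, chosen only for illustration, and the dynamic schedule is just one of the schedule kinds OpenMP defines.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example: sum the squares of 0..n-1 in parallel.
 * schedule(dynamic, chunk) hands out blocks of `chunk` iterations to
 * threads as they become free; reduction(+:total) gives each thread a
 * private partial sum and combines the partial sums at the end.
 * If the compiler is run without OpenMP enabled, the pragma is ignored
 * and the loop runs serially with the same result.
 */
double sum_of_squares(size_t n, int chunk)
{
    double total = 0.0;

    /* OpenMP requires a positive chunk size. */
    assert(chunk > 0);

    #pragma omp parallel for schedule(dynamic, chunk) reduction(+:total)
    for (long i = 0; i < (long)n; ++i) {
        total += (double)i * (double)i;
    }
    return total;
}
```

To try this, enable OpenMP at compile time (for example, -qopenmp with the Intel compilers or -fopenmp with GCC). Thread affinity is typically controlled outside the source code through environment variables, such as the standard OMP_PROC_BIND and OMP_PLACES or the Intel runtime's KMP_AFFINITY. Which setting and which chunk size perform best depends on the application and should be measured.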
It is essential that you read this guide from start to finish, using the built-in hyperlinks to guide you along a path to successfully porting and tuning your application(s) on Intel® Xeon Phi™ architecture. The paths provided in this guide reflect the steps necessary to get the best possible application performance.
The next chapter, Vectorization Essentials, covers techniques to help vectorize your code along with best methods for efficient vectorization.