Efficient Parallelization

By Ronald W Green,

Published:11/07/2013   Last Updated:08/29/2014

Compiler Methodology for Intel® MIC Architecture

Efficient Parallelization


This article is part of the Intel® Modern Code Developer Community documentation which supports developers in leveraging application performance in code through a systematic step-by-step optimization framework methodology. This article addresses: Thread level parallelization.


This chapter covers topics in parallelization. There are links to various parallelization methods and resources along with tips and techniques for getting optimal parallel performance.


In this chapter, you will learn techniques for the Intel OpenMP* runtime library provided with the Intel compilers, Intel® MPI, and Intel® Threading Building Blocks (Intel® TBB).


The following sub-chapters provide more information on parallelization topics. Click the links below to access these topics.

Take Aways

In this chapter, various parallelization methods were presented. For OpenMP, two major performance techniques were presented: controlling thread affinity and controlling OpenMP scheduling.


It is essential that you read this guide from start to finish using the built-in hyperlinks to guide you along a path to a successful port and tuning of your application(s) on Intel® Xeon Phi™ architecture. The paths provided in this guide reflect the steps necessary to get best possible application performance.

The next chapter, Vectorization Essentials, covers techniques to help vectorize your code along with best methods for efficient vectorization.

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804