Efficient Parallelization

By Ronald W Green

Published: 11/07/2013   Last Updated: 08/29/2014

Compiler Methodology for Intel® MIC Architecture

This article is part of the Intel® Modern Code Developer Community documentation, which helps developers improve application performance through a systematic, step-by-step optimization framework. This article addresses thread-level parallelization.


This chapter covers topics in parallelization. There are links to various parallelization methods and resources along with tips and techniques for getting optimal parallel performance.


In this chapter, you will learn techniques for the Intel OpenMP* runtime library provided with the Intel compilers, Intel® MPI, and Intel® Threading Building Blocks (Intel® TBB).


The following sub-chapters provide more information on parallelization topics. Click the links below to access these topics.

Takeaways

This chapter presented various parallelization methods. For OpenMP, it covered two major performance techniques: controlling thread affinity and controlling OpenMP scheduling.
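As a rough illustration of the two OpenMP techniques, the sketch below shows a loop whose schedule is chosen at run time. It is not taken from the sub-chapters: the function names `row_cost` and `imbalanced_sum` and the triangular workload are hypothetical, chosen only to make the iterations deliberately uneven. The `schedule(runtime)` clause defers the scheduling policy to the standard `OMP_SCHEDULE` environment variable, so different policies can be compared without recompiling.

```c
/* Hypothetical workload: iteration i does about i units of work,
 * so a plain static split would leave some threads idle. */
static double row_cost(int i)
{
    double acc = 0.0;
    for (int j = 0; j <= i; ++j)
        acc += 1.0;
    return acc;
}

/* Sums row_cost(i) for i in [0, n). The result equals n*(n+1)/2.
 * schedule(runtime) takes the policy from OMP_SCHEDULE at run time;
 * reduction(+:total) gives each thread a private partial sum.
 * Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
double imbalanced_sum(int n)
{
    double total = 0.0;
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += row_cost(i);
    return total;
}
```

Assuming a program that calls `imbalanced_sum` is built with OpenMP enabled, setting `OMP_SCHEDULE="dynamic,4"` before running it would hand out iterations in chunks of four as threads become free. Thread affinity is likewise set from the environment rather than in the code: the standard `OMP_PROC_BIND` and `OMP_PLACES` variables, or the Intel-specific `KMP_AFFINITY` variable, control where threads are pinned. Which settings perform best depends on the application and the hardware.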


It is essential that you read this guide from start to finish, using the built-in hyperlinks to guide you along a path to successfully porting and tuning your applications on Intel® Xeon Phi™ architecture. The paths provided in this guide reflect the steps necessary to get the best possible application performance.

The next chapter, Vectorization Essentials, covers techniques to help vectorize your code along with best methods for efficient vectorization.
