This article discusses parallelization and provides links that will help you understand your programming environment and evaluate your app's suitability for parallelization.
Cache Blocking Techniques Overview
Memory Layout Transformations Overview
Get an overview of parallelization using the Intel® MPI Library and links to additional documentation.
Optimization reports from the Intel® compilers guide the developer with details about the optimizations.
Reference Link and Download
Intel Vectorization Tools
The latest Intel® Architecture Instruction Set Extensions Programming Reference includes the definition of Intel® Advanced Vector Extensions 512 (Intel® AV
Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1 to consider three forms of parallelism: vectorization (single instruction, multiple data, or SIMD), shared memory parallelism (threading), and distributed memory computing.
This article focuses on the steps to improve software performance with vectorization. Included are examples of full applications along with some simpler cases to illustrate the steps to vectorization.
This article is part of the Intel® Modern Code Developer Community documentation, which helps developers improve application performance through a systematic, step-by-step optimization framework. This article addresses thread-level parallelization.