Download page for the latest Intel® Software Development Emulator
The purpose of this document is to help developers determine which FFT implementation, Intel® MKL or Intel® IPP, is best suited for their application.
When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
One key to attaining good parallel performance is choosing the right granularity for the application. Granularity is the amount of real work in the parallel task. If granularity is too fine, then performance can suffer from communication overhead.
How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
A step-by-step introduction to application performance tuning using the Intel® Compilers version 13 for IA-32 and Intel® 64 processors that are included with Intel® Parallel Studio XE 2013
Multidimensional Fast Fourier Transform (FFT) - selecting optimal sizes and data layout
This article describes a method to compile and run a distributed memory coarray program using Intel® Parallel Studio XE Cluster Edition for Linux*. An example using Linux* is presented.
List of Intel IPP functions optimized for processors code-named Haswell and Skylake
Compiler Methodology for Intel® MIC Architecture
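The granularity trade-off described in the nested-loop and granularity entries above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from any of the listed articles: the helper names `row_sum`, `parallel_sum`, and `task_count` and the `rows_per_task` parameter are hypothetical, and Python's standard `concurrent.futures` thread pool stands in for an OpenMP-style runtime. The outer loop of a nested loop is split into blocks of rows, and the block size sets how much real work each task carries.

```python
from concurrent.futures import ThreadPoolExecutor


def row_sum(matrix, start, stop):
    """Sum all elements in rows [start, stop) with a plain nested loop."""
    total = 0
    for i in range(start, stop):
        for value in matrix[i]:
            total += value
    return total


def parallel_sum(matrix, rows_per_task, workers=4):
    """Sum a matrix by giving each task a block of `rows_per_task` rows.

    A small `rows_per_task` creates many tiny tasks (fine granularity,
    more scheduling overhead); a large one creates few big tasks (coarse
    granularity, less overhead but possibly poorer load balance).
    """
    n = len(matrix)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(row_sum, matrix, s, min(s + rows_per_task, n))
            for s in range(0, n, rows_per_task)
        ]
        return sum(f.result() for f in futures)


def task_count(n_rows, rows_per_task):
    """Number of tasks created for a given granularity (ceiling division)."""
    return -(-n_rows // rows_per_task)


# Hypothetical 64 x 8 test matrix holding the integers 0..511.
matrix = [[i * 8 + j for j in range(8)] for i in range(64)]
fine = parallel_sum(matrix, rows_per_task=1)     # one task per row
coarse = parallel_sum(matrix, rows_per_task=16)  # one task per 16-row block
```

Both calls compute the same sum; only the number of tasks changes. The sketch shows how the work is partitioned rather than serving as a benchmark, since CPython threads do not run this kind of arithmetic in parallel.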