Performance Tools for Software Developers - SSE generation and processor-specific optimizations continued. Can I combine the processor values and target more than one processor? How do I generate optimized code for both Intel and AMD* architectures? Where can I find more information on processor-specific optimizations?
The purpose of this document is to help developers determine which FFT implementation, Intel® MKL or Intel® IPP, is best suited for their application.
Vectorization is one of many optimizations that are enabled by default in the latest Intel compilers. To be vectorized, loops must obey certain conditions, listed below. Some additional ways to help the compiler vectorize loops are also described.
The compiler supports many options that tune or optimize an application for different Intel and non-Intel processors. The differences between these options are explained, and the switches /arch, /Qx..., /Qax... (Windows*) and -m, -x..., -ax... (Linux*, Mac OS* X) are recommended.
The Intel® C++ Compiler 11.1 Professional Edition now allows you to merge .dyn files with customized weighting.
The multi-core performance of a legacy Fortran benchmark unsuited to data parallelism is enhanced by threading using the TASK construct of OpenMP and the Intel Fortran Compiler. The necessary source code changes are explained in detail.
New BRNG SFMT19937 in Intel MKL
Guided Auto-Parallel - a compiler feature that advises the user on the changes needed for the compiler to automatically vectorize or parallelize a serial application.
The switch /Qtrapuv (-ftrapuv) for detecting certain uninitialized variables is designed to work with optimization disabled.
The following article explains how to use Intel® MKL with NumPy/SciPy, MATLAB, C#, Java, Python, NAG, GROMACS, GNU Octave, PETSc, HPL, HPCC, IMSL, etc.