Tips and techniques on using the Intel® Compilers to maximize your application performance.
Information about Intel® Integrated Performance Primitives (Intel® IPP) memory functions
Performance Tools for Software Developers - SSE generation and processor-specific optimizations, continued. Can I combine the processor values and target more than one processor? How do I generate optimized code for both Intel and AMD* architectures? Where can I find more information on processor-specific optimizations?
Loop blocking is a combination of strip mining and loop interchange that enhances reuse of local data. It helps nested loops that manipulate arrays too large to fit into the cache. Loop blocking allows reuse of the arrays by transforming the
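As a rough illustration of loop blocking, the sketch below transposes a square matrix twice: once with a plain loop nest and once with the loops strip-mined and interchanged so that the work proceeds tile by tile. The function names and the tile size `BLOCK` are hypothetical choices made for this example, not taken from the article; a suitable tile size depends on the cache of the target processor.

```c
#include <stddef.h>

#define BLOCK 32  /* hypothetical tile size; tune for the target cache */

/* Plain transpose: dst is written with stride n, so for large n a cache
   line of dst may be evicted before its neighbouring elements are used. */
void transpose_naive(const double *src, double *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked transpose: both loops are strip-mined into chunks of BLOCK and
   the chunk loops are moved outward (loop interchange), so each
   BLOCK x BLOCK tile of src and dst is finished while it is still cached. */
void transpose_blocked(const double *src, double *dst, size_t n)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            /* Clamp the tile bounds so n need not be a multiple of BLOCK. */
            size_t i_end = (ii + BLOCK < n) ? ii + BLOCK : n;
            size_t j_end = (jj + BLOCK < n) ? jj + BLOCK : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions produce the same result; only the order in which elements are visited differs. Whether the blocked version is faster depends on the matrix size, the cache hierarchy, and what the compiler already does on its own.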
This article describes the effect of the /Qpar-threshold option on auto-parallelization with the Intel C++ compiler.
Warning about using runtime checks and optimization together. All optimizations will be disabled.
Vectorization is one of many optimizations that are enabled by default in the latest Intel compilers. In order to be vectorized, loops must obey certain conditions, listed below. Some additional ways to help the compiler to vectorize loops are described.
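The two loops below sketch the kind of difference that typically matters for vectorization. They are illustrative assumptions, not examples from the article: the function names are hypothetical, and whether a given compiler actually vectorizes either loop depends on its version and options. The first loop has a trip count known on entry, a single exit, no function calls, and independent iterations, with `restrict` telling the compiler that the arrays do not overlap. The second has a loop-carried dependence, since each iteration reads the value written by the previous one.

```c
#include <stddef.h>

/* A loop that is usually a good candidate for vectorization: countable,
   single entry and exit, straight-line body, and no dependence between
   iterations. The restrict qualifiers promise that x and y do not alias. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* A loop that generally cannot be vectorized as written: y[i] depends on
   y[i - 1], which was computed in the previous iteration. */
void prefix_sum(size_t n, float *y)
{
    for (size_t i = 1; i < n; i++)
        y[i] = y[i] + y[i - 1];
}
```

The compiler's vectorization report is the reliable way to check which loops were vectorized and why others were not.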
The compiler supports many options that tune or optimize an application for different Intel and non-Intel processors. Differences are explained, and the switches /arch, /Qx..., /Qax... (Windows*) and -m, -x..., -ax... (Linux*, Mac OS* X) are recommended.
MSC.Software SimXpert* is a fully integrated simulation environment for performing multidiscipline-based analysis, with a graphical interface designed to facilitate end-to-end simulations. This article describes the threading of SimXpert.
There are various functions you may use to measure the computational time of IPP functions. The method we recommend is to use ippGetCpuClocks() from IPP itself.