The prior part (2) of this blog provided a header and a set of functions that can be used to determine the logical core and the logical Hyper-Thread number within the core. This determination is to be used in an optimization strategy called the Hyper-Thread Phalanx.
For the Intel® Compiler, vectorization is the unrolling of a loop combined with the generation of packed SIMD instructions. Because the packed instructions operate on more than one data element at a time, the loop can execute more efficiently. The above message indicates that the loop was successfully vectorized using packed SIMD instructions.
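To make the idea concrete, here is a minimal sketch of the kind of loop a vectorizing compiler can typically turn into packed SIMD instructions. The function name `saxpy` and its parameters are illustrative assumptions, not taken from the original post, and whether a given compiler actually vectorizes it depends on the optimization options and target instruction set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: computes y[i] = a * x[i] + y[i] for i in [0, n).
 * Each iteration is independent and the accesses are unit-stride.
 * `restrict` promises the compiler that x and y do not overlap,
 * which removes one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Precondition: non-empty input must come with valid pointers. */
    assert(n == 0 || (x != NULL && y != NULL));

    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With 128-bit SSE registers a vectorized version of this loop could process four `float` elements per instruction, and with 256-bit AVX registers eight, instead of one element per iteration in the scalar form.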
The vectorizer cannot safely use aligned loads or stores for this data access, either because the data are not aligned to an n-byte boundary in memory, or because the compiler does not know the alignment. The compiler must use unaligned memory accesses, which may be less efficient. The value of n depends on the targeted instruction set and corresponds to the width of the vector instructions: 16 for Intel® SSE, 32 for Intel® AVX and 64 for Intel® AVX-512 instructions.
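As a rough illustration of what "aligned to an n-byte boundary" means, the sketch below checks a pointer's alignment and allocates a buffer on a chosen boundary using the C11 `aligned_alloc` function. The helper names `is_aligned` and `alloc_aligned_floats` are hypothetical and introduced only for this example; they are not part of any Intel header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if p lies on an n-byte boundary.
 * n must be a power of two (16 for SSE, 32 for AVX, 64 for AVX-512). */
int is_aligned(const void *p, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return ((uintptr_t)p & (n - 1)) == 0;
}

/* Hypothetical helper: allocates room for `count` floats on an n-byte
 * boundary. C11 aligned_alloc expects the size to be a multiple of the
 * alignment, so the byte count is rounded up. Release with free(). */
float *alloc_aligned_floats(size_t count, size_t n)
{
    size_t bytes = count * sizeof(float);
    size_t rounded = (bytes + n - 1) / n * n;
    return aligned_alloc(n, rounded);
}
```

Allocating on the right boundary is only half of the story: the compiler still has to know about the alignment at the point of the loop. One standard way to state it is the `aligned` clause of the OpenMP 4.0 `#pragma omp simd` directive. Making such a promise about data that is not actually aligned can cause a fault at run time.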
This free webinar series presents tools, tips, and techniques that will help sharpen your development skills on Intel processors and coprocessors, including Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors. Intel technical experts and open source innovators discuss topics ranging from compiler techniques, including vectorization and OpenMP 4.0, to performance libraries, debugging, error checking, and tuning to boost application and platform performance. Come to the live sessions with your programming questions for Intel technical experts to answer.
Intel® Parallel Studio XE 2013 SP1 parallel software development suite combines Intel's C/C++ compiler and Fortran compiler; performance and parallel libraries; error checking, code robustness, and performance profiling tools into a single suite offering. This new product release includes: