The key to performance measurement is two-fold: know exactly what you are measuring, and collect your baseline data. Next, profile your application and identify a specific and realistic performance goal based on the profiling data. Follow these steps to optimize your software.
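As a rough illustration of collecting baseline data, the sketch below times one small kernel and keeps the best of several runs. The kernel `saxpy` and the helper `baseline_seconds` are hypothetical names chosen for this example, and `clock()` reports CPU time at coarse resolution, so a real measurement would likely use a larger workload and a finer timer.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical kernel under test: y[i] = a * x[i] + y[i]. */
void saxpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Runs the kernel `repeats` times and returns the smallest CPU time
 * observed, in seconds. Taking the minimum reduces noise from other
 * activity on the machine. Note that y is updated on every run. */
double baseline_seconds(float a, const float *x, float *y,
                        size_t n, int repeats)
{
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        clock_t start = clock();
        saxpy(a, x, y, n);
        clock_t stop = clock();
        double elapsed = (double)(stop - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Recording this number before any change gives a reference point against which later optimizations can be compared.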
The Intel Compilers provide a number of features for generating vectorized code. Auto-vectorization is the method used by the Intel Compilers to generate vectorized code for a given application without requiring code changes. Developers can also implement simple coding changes in the source code to enforce vectorization behavior.
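The two approaches can be sketched in C as follows. The function names are hypothetical. The first loop is written so that a vectorizing compiler has a good chance of vectorizing it without help: the `restrict` qualifiers promise that the arrays do not overlap. The second loop uses the standard OpenMP `simd` pragma to request vectorization of a reduction explicitly. Whether either loop is actually vectorized depends on the compiler, its options, and the target.

```c
#include <stddef.h>

/* Candidate for auto-vectorization: unit-stride accesses, independent
 * iterations, and restrict-qualified pointers that rule out aliasing. */
void add_arrays(const float *restrict a, const float *restrict b,
                float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Simple source change: an OpenMP simd pragma asks the compiler to
 * vectorize the loop and declares sum as a reduction variable.
 * Compilers without OpenMP SIMD support enabled ignore the pragma. */
float dot(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```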
Proven techniques for code optimizations and change recommendations are listed here. Note that how well each recommendation applies depends on the application.
Code changes may be required to facilitate vectorization even further. Once a developer has changed the code, how does one verify that the changes elicit the expected behavior? Use the compiler's optimization reports to guide source code changes and to verify that the code does indeed vectorize.
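A small pair of loops shows the kind of question an optimization report answers. The function names are hypothetical, and the option spelling in the comment (`-qopt-report`) is an assumption that should be checked against the documentation for the compiler version in use. The first loop carries a dependence from one iteration to the next, which typically prevents straightforward vectorization. The second has independent iterations.

```c
#include <stddef.h>

/* Assumed build line; verify the option names for your compiler:
 *   icx -O2 -qopt-report -c loops.c
 * Then read the generated report for the remarks on each loop. */

/* Loop-carried dependence: v[i] needs the v[i-1] computed in the
 * previous iteration, so iterations cannot simply run side by side. */
void running_sum(float *v, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        v[i] += v[i - 1];
}

/* Independent iterations over non-overlapping arrays: a natural
 * candidate for the report to list as vectorized. */
void scale(float *restrict dst, const float *restrict src,
           float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}
```

Comparing the report before and after a source change is one way to confirm that the change had the intended effect on a given loop.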
The techniques that offer the most control require greater knowledge of the application and more skill in judging where to apply them. Used properly, however, these more intensive techniques, such as intrinsics, can yield greater performance.
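For a sense of what the intrinsics approach looks like, here is a minimal sketch that adds two float arrays four elements at a time using SSE intrinsics. The function name is hypothetical, the code assumes an x86 target with SSE, and unaligned loads and stores are used so that no alignment is required of the caller. It is an illustration of the style, not a tuned implementation.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; x86 targets only */

/* Adds a and b element-wise into out. The main loop handles four
 * floats per iteration; the scalar loop handles the remainder. */
void add_arrays_sse(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* load 4 floats, unaligned */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)                     /* leftover elements */
        out[i] = a[i] + b[i];
}
```

The programmer now owns details the compiler would otherwise handle, such as the vector width and the remainder loop, which is the trade-off for the added control.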