This chapter details some of the advanced compiler optimizations for performance on Intel® MIC Architecture; most of these optimizations also apply to host applications. It covers topics such as the floating-point model, prefetching, and the use of streaming stores. It is a good chapter for users who are still not seeing their desired performance or who are looking for the last level of performance enhancements.
The goal of this chapter is to explore a variety of advanced optimizations and determine which may be useful for your application.
It is essential that you read this guide from start to finish, using the built-in hyperlinks to guide you along a path to a successful port and tuning of your application(s) on Intel® Xeon Phi™ architecture. The paths provided in this guide reflect the steps necessary to get the best possible application performance.
The next chapter, The Native and Offload Programming Models, presents a variety of programming models and data considerations to help you get the most performance out of the Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804