In January 2000, Intel published an optimized matrix library (4D single-precision matrix and vector classes) for use with the Pentium® III Streaming SIMD (Single Instruction, Multiple Data) Extensions, or SSE, in an article on www.gamasutra.com.
Since then, Intel has introduced a new processor, the Intel® Pentium® 4 processor. Its new SSE2 instructions add support for packed double-precision calculations. While a Pentium® III processor's SSE instructions treat a 128-bit register as four single-precision elements, the Pentium® 4 processor's SSE2 instructions treat the same 128-bit register as two double-precision elements.
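To make the register layout concrete, here is a minimal sketch using the standard SSE and SSE2 compiler intrinsics. The helper names `add4f` and `add2d` are hypothetical and chosen only for illustration; they are not part of the matrix library described in this article. The sketch assumes an x86 target with SSE2 enabled.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE:  __m128,  packed single-precision intrinsics
#include <emmintrin.h>  // SSE2: __m128d, packed double-precision intrinsics

// Both vector types occupy one 128-bit register; only the element layout differs.
static_assert(sizeof(__m128) == 16, "four 32-bit floats");
static_assert(sizeof(__m128d) == 16, "two 64-bit doubles");

// Hypothetical helper: adds four single-precision values with one packed SSE add.
void add4f(const float* a, const float* b, float* out) {
    assert(a && b && out);
    __m128 va = _mm_loadu_ps(a);             // unaligned load of 4 floats
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  // 4 additions at once
}

// Hypothetical helper: adds two double-precision values with one packed SSE2 add.
void add2d(const double* a, const double* b, double* out) {
    assert(a && b && out);
    __m128d va = _mm_loadu_pd(a);            // unaligned load of 2 doubles
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));  // 2 additions at once
}
```

One consequence of this layout is that a 4-element double-precision vector needs two registers where its single-precision counterpart needs one, so a double-precision operation on such a vector would typically take two packed instructions instead of one.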
Using the new SSE2 instructions, Intel has completed an enhanced version of the optimized matrix library. The new library contains classes similar to those of its predecessor, plus additional classes that provide the same functionality using double-precision arithmetic.
In this article, we describe the new library and its classes and give some examples of its use. At the end of the article, we provide links to the library itself and to other helpful resources, such as a free evaluation copy of the Intel® C/C++ Compiler.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804