Optimized Matrix Library for use with the Intel® Pentium® 4 Processor's SSE2 Instructions

Published: 03/27/2012, Last Updated: 03/27/2012


In January 2000, Intel published an optimized matrix library (4D single-precision matrix and vector classes) for use with the Pentium® III processor's Streaming SIMD (Single Instruction, Multiple Data) Extensions, or SSE, in an article on www.gamasutra.com.

Since then, Intel has introduced a new processor, the Intel® Pentium® 4 processor. Its new SSE2 instructions add, among other things, double-precision floating-point calculations. While a 128-bit SSE register on a Pentium® III processor holds four single-precision elements, the same 128-bit register used with the Pentium® 4 processor's SSE2 instructions holds two double-precision elements.
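The difference in register layout can be illustrated with compiler intrinsics. The following is a minimal sketch, not code from the library: the function names `add4f` and `add2d` are hypothetical and chosen only for this illustration. It assumes an x86 compiler that provides the standard `<xmmintrin.h>` (SSE) and `<emmintrin.h>` (SSE2) headers.

```cpp
#include <xmmintrin.h>  // SSE:  __m128,  _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#include <emmintrin.h>  // SSE2: __m128d, _mm_loadu_pd, _mm_add_pd, _mm_storeu_pd
#include <array>

// Hypothetical helper (not part of the library):
// adds four single-precision values with one packed SSE addition.
std::array<float, 4> add4f(const std::array<float, 4>& a,
                           const std::array<float, 4>& b) {
    __m128 va = _mm_loadu_ps(a.data());  // 4 x 32-bit floats in one 128-bit register
    __m128 vb = _mm_loadu_ps(b.data());
    std::array<float, 4> out;
    _mm_storeu_ps(out.data(), _mm_add_ps(va, vb));
    return out;
}

// Hypothetical helper (not part of the library):
// adds two double-precision values with one packed SSE2 addition.
std::array<double, 2> add2d(const std::array<double, 2>& a,
                            const std::array<double, 2>& b) {
    __m128d va = _mm_loadu_pd(a.data());  // 2 x 64-bit doubles in one 128-bit register
    __m128d vb = _mm_loadu_pd(b.data());
    std::array<double, 2> out;
    _mm_storeu_pd(out.data(), _mm_add_pd(va, vb));
    return out;
}
```

Both helpers operate on a full 128-bit register per instruction. The single-precision version processes four elements at a time and the double-precision version processes two, so a 4D double-precision vector needs two registers where its single-precision counterpart needs one.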

Using the new SSE2 instructions, Intel has completed an enhanced version of the optimized matrix library. The new library contains classes similar to those of its predecessor, plus additional classes that provide the same functionality using double-precision arithmetic.

In this article, we describe the new library and its classes and provide some examples of its use. At the end of the article, we provide links to the library itself and to other helpful resources, such as a free evaluation copy of the Intel® C/C++ Compiler.

