SSE2 Instructions in a Double-precision 3D Transform

January 14, 2010 11:00 PM PST



Introduction

The Streaming SIMD Extensions 2 (SSE2) technology introduces new Single Instruction Multiple Data (SIMD) double-precision floating-point instructions and new SIMD integer instructions into the IA-32 Intel® architecture. The double-precision SIMD instructions extend functionality in a manner analogous to the single-precision instructions introduced with the Streaming SIMD Extensions (SSE). The 128-bit SIMD integer extensions are a full superset of the 64-bit integer SIMD instructions, with additional instructions to support more integer data types, conversion between integer and floating-point data types, and efficient operations between the caches and system memory. These instructions provide a means to accelerate operations typical of 3D graphics, real-time physics, spatial (3D) audio, video encoding/decoding, encryption, and scientific applications. This application note describes the implementation of a double-precision 3D geometry transformation and includes code examples that exploit the SSE2 instructions.

Several implementations are compared, each using the 128-bit XMM registers and the related SSE2 instructions to operate on packed double-precision floating-point data. The optimization techniques behind each one are discussed to give developers the tools for optimizing their own implementations of the double-precision 3D transformation. Source code containing the various implementations of the algorithm is included.
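
To give a rough idea of what working with packed double-precision data in XMM registers looks like, here is a minimal sketch in C using the SSE2 intrinsics from `<emmintrin.h>`. It is not one of the implementations from this application note. The function name `transform_point_sse2`, the row-major 4x4 matrix layout, and the homogeneous (x, y, z, w) vertex format are all assumptions made for illustration, and the code assumes an x86 target with SSE2 enabled.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d and the _mm_*_pd family */

/*
 * Hypothetical helper (not from the application note):
 * computes out = M * v for a row-major 4x4 matrix M and a
 * homogeneous vertex v = (x, y, z, w), all in double precision.
 * Each XMM register holds two packed doubles.
 */
static void transform_point_sse2(const double m[16],
                                 const double v[4],
                                 double out[4])
{
    /* Load the vertex as two packed pairs: (x, y) and (z, w).
       Unaligned loads are used so the caller need not align its data. */
    __m128d v_xy = _mm_loadu_pd(&v[0]);
    __m128d v_zw = _mm_loadu_pd(&v[2]);

    for (int row = 0; row < 4; ++row) {
        /* Load one matrix row as two packed pairs. */
        __m128d r_lo = _mm_loadu_pd(&m[4 * row]);
        __m128d r_hi = _mm_loadu_pd(&m[4 * row + 2]);

        /* Two packed multiplies and one packed add give the partial sums
           (m0*x + m2*z, m1*y + m3*w). */
        __m128d sum = _mm_add_pd(_mm_mul_pd(r_lo, v_xy),
                                 _mm_mul_pd(r_hi, v_zw));

        /* Horizontal add: copy the high element down and add it to the low. */
        __m128d high = _mm_unpackhi_pd(sum, sum);
        _mm_store_sd(&out[row], _mm_add_sd(sum, high));
    }
}
```

Called with a translation matrix for the offset (1, 2, 3) and the vertex (1, 1, 1, 1), this sketch writes (2, 3, 4, 1) to `out`. The horizontal add at the end of each row is the kind of overhead that a data layout processing two vertices at once could avoid, at the cost of rearranging the input.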