Determine the level of IA-32 processor-architecture compatibility an application provides. Many applications today must support hardware for at least five years. This is forever in terms of hardware technology, when you consider that five years ago the most common business computer was based on the Intel® Pentium® processor. The first release of the Pentium processor preceded MMX™ technology. Today's processors are on their third generation of specialized multimedia instructions.
Match the multimedia instruction types the application requires to the corresponding processor architecture. The following table lists Intel processors and the instruction types each one supports, including the Single Instruction Multiple Data (SIMD) instruction sets.
|Processor type||x87||MMX™ Technology||SSE||SSE2|
|Pentium® Pro processor||Yes||No||No||No|
|Pentium processor with MMX™ Technology||Yes||Yes||No||No|
|Pentium® II processor||Yes||Yes||No||No|
|Pentium® III processor||Yes||Yes||Yes||No|
|Pentium® 4 processor||Yes||Yes||Yes||Yes|
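A program can check at run time which of these instruction types the processor supports. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler that provides `<cpuid.h>` and its `__get_cpuid` helper. The `simd_support` struct and the function names are hypothetical and chosen for illustration; the bit positions are the ones CPUID leaf 1 reports in EDX.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>            /* GCC/Clang helper for the CPUID instruction */
#define HAVE_CPUID_H 1
#else
#define HAVE_CPUID_H 0        /* not an x86 build: report no support */
#endif

/* Feature bits reported in EDX by CPUID leaf 1. */
#define EDX_FPU  (1u << 0)    /* x87 floating-point unit */
#define EDX_MMX  (1u << 23)
#define EDX_SSE  (1u << 25)
#define EDX_SSE2 (1u << 26)

/* Hypothetical record with one flag per instruction type in the table. */
struct simd_support {
    int x87, mmx, sse, sse2;
};

/* Decode an EDX feature word into the four flags. */
struct simd_support decode_features(unsigned int edx)
{
    struct simd_support s;
    s.x87  = (edx & EDX_FPU)  != 0;
    s.mmx  = (edx & EDX_MMX)  != 0;
    s.sse  = (edx & EDX_SSE)  != 0;
    s.sse2 = (edx & EDX_SSE2) != 0;
    return s;
}

/* Query the running processor; all flags are 0 if leaf 1 is unavailable. */
struct simd_support query_simd_support(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
#if HAVE_CPUID_H
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        edx = 0;
#endif
    (void)eax; (void)ebx; (void)ecx;   /* only EDX is needed here */
    return decode_features(edx);
}
```

Calling `query_simd_support()` once at startup and caching the result is usually enough, since the answer does not change while the program runs.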
The next table gives additional detail about instruction types:
|Instruction Type||Comments||Precision (in bits)|
|x87 Instructions||The minimum floating-point support, available on every processor listed above.||80-bit internal|
|MMX™ Technology||MMX Technology instructions provide integer SIMD support using eight 64-bit registers. MMX Technology cannot be used at the same time as the floating-point unit: the MMX Technology registers are mapped onto the floating-point registers, so an EMMS instruction is required when passing from MMX Technology code to x87 floating-point code. Keep floating-point code separate from MMX Technology code. MMX Technology is useful for 2D graphics and blending images. Data should be 8-byte aligned.||8-, 16-, 32-, and 64-bit integers|
|Streaming SIMD Extensions (SSE)||SSE adds eight new 128-bit registers, each directly addressable by the register names XMM0 to XMM7. Each register holds four 32-bit single-precision floating-point numbers, numbered 0 through 3. Because these are separate registers, MMX Technology or x87 floating-point instructions can be mixed with SSE instructions without executing a special instruction. Data alignment is important: make sure your floating-point vectors are 16-byte aligned. SSE also added prefetch instructions.||32-bit, single precision|
|Streaming SIMD Extensions 2 (SSE2)||SSE2 uses the same eight 128-bit registers as SSE. It adds instructions for double-precision floating-point values, operating on vectors of two double-precision components. SSE2 also allows the integer instructions introduced with MMX Technology to use the 128-bit registers, which removes the restrictions on MMX Technology and allows twice as much data to be processed in a single instruction.||64-bit, double precision|
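The 16-byte alignment advice for SSE can be illustrated with a short sketch. It assumes a compiler that provides the SSE intrinsics header `<xmmintrin.h>` and defines `__SSE__` when SSE code generation is enabled; the function name `add_floats` is hypothetical. Where SSE is not enabled, the same function falls back to a plain loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps */
#endif

/* Add two float arrays element by element.
 * Assumption: a, b, and out all point to 16-byte aligned storage. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* _mm_load_ps and _mm_store_ps require 16-byte aligned addresses. */
    assert((((uintptr_t)a | (uintptr_t)b | (uintptr_t)out) % 16) == 0);

#if defined(__SSE__)
    /* Process four single-precision values per iteration in one XMM register. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    /* Scalar loop: handles the tail, or everything when SSE is not enabled. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Callers are responsible for the alignment, for example by declaring buffers with the C11 `_Alignas(16)` specifier.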
So how can you best take advantage of new multimedia instructions? Some compilers do a good job of generating multimedia instructions on their own. Even so, there is no replacement for hand optimization of specific "hot spots" in the code. Generally, it is best to optimize for the latest set of multimedia instructions, because each set is a superset of the earlier ones and the end user can therefore benefit most. Typically, it is best to create two code paths: one for the least common denominator of the processors the application supports, and one for the processor with the latest set of multimedia instructions.
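The two-code-path approach can be sketched as a function pointer chosen once at startup. This is an illustration under stated assumptions, not a definitive implementation: it relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, compiles the SSE2 path only when the compiler defines `__SSE2__`, and uses hypothetical names (`scale_fn`, `scale_baseline`, `scale_sse2`, `select_scale`). In a real build the two paths would normally live in separately compiled files so that the baseline path contains no SSE2 instructions.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: __m128d, _mm_loadu_pd, _mm_mul_pd, _mm_storeu_pd */
#endif

/* Both code paths share this signature. */
typedef void (*scale_fn)(double *data, size_t n, double factor);

/* Least-common-denominator path: plain C, no SIMD intrinsics. */
void scale_baseline(double *data, size_t n, double factor)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= factor;
}

#if defined(__SSE2__)
/* SSE2 path: two doubles per instruction.
 * Unaligned loads and stores keep the sketch free of alignment requirements. */
void scale_sse2(double *data, size_t n, double factor)
{
    const __m128d vf = _mm_set1_pd(factor);
    size_t i = 0;
    for (; i + 2 <= n; i += 2)
        _mm_storeu_pd(data + i, _mm_mul_pd(_mm_loadu_pd(data + i), vf));
    for (; i < n; ++i)          /* odd element left over, if any */
        data[i] *= factor;
}
#endif

/* Choose a code path based on what the running processor reports. */
scale_fn select_scale(void)
{
#if defined(__SSE2__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return scale_sse2;
#endif
    return scale_baseline;
}
```

A caller would store the result once, for example `scale_fn scale = select_scale();`, and then call `scale(buffer, count, 2.0)` in the hot spot, so the feature check is not repeated on every call.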
How Much is Optimizing Floating-Point Operations Worth?