Note: this article describes the SIMD support in versions 5.3 through 6.1 of the Intel IPP library. The minimum SIMD requirements changed with release 7.0 of the library. For more information about the SIMD optimization layers in the Intel IPP 7.0 library, please read the article titled Understanding SIMD Optimization Layers and Dispatching in the Intel® IPP 7.0 Library.
The Intel IPP library contains a collection of functionally identical processor-specific optimized libraries that are “dispatched” at run-time. The “dispatcher” chooses which of these processor-specific optimized libraries to use when your application makes a call into the IPP library. This is done to maximize each function’s use of the underlying SIMD instructions and other architecture-specific features.
Note: you can build custom processor-specific libraries that do not require the dispatcher, but that is outside the scope of this article. Please read this IPP linkage models article for information on how to build custom versions of the IPP library.
Dispatching refers to the process of detecting CPU features at run-time and then selecting the Intel IPP optimized library set that corresponds to your CPU. For example, in the \ia32\bin directory, the ippiv8-x.x.dll library file contains version ‘x.x’ of the optimized image processing libraries for Intel® Core™ 2 Duo processors; ‘ippi’ refers to the image processing library, ‘v8’ refers to the Core 2 architecture, and ‘x.x’ refers to the library’s major version numbers.
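The naming scheme described above can be decoded mechanically. The following is a minimal sketch, assuming the file name follows the pattern ipp<domain><arch>-<version>.dll with a two-character architecture code immediately before the hyphen. The type IppDllName, the function parse_ipp_dll_name() and the version string "6.1" used to exercise it are illustrative assumptions and are not part of the IPP library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type, not part of the IPP API. */
typedef struct {
    char domain[8];   /* e.g. "i" for image processing */
    char arch[3];     /* two-character architecture code, e.g. "v8" */
    char version[16]; /* e.g. "6.1" */
} IppDllName;

/* Splits a name such as "ippiv8-6.1.dll" into its parts.
 * Returns 0 on success, -1 if the name does not match the assumed pattern. */
int parse_ipp_dll_name(const char *name, IppDllName *out)
{
    const char *dash, *ext;
    size_t domain_len, version_len;

    if (strncmp(name, "ipp", 3) != 0)
        return -1;

    dash = strchr(name, '-');
    ext = strrchr(name, '.');
    /* Require "ipp", at least one domain letter and two arch characters
     * before the hyphen, and a ".dll" extension after it. */
    if (dash == NULL || ext == NULL || ext < dash || dash - name < 6)
        return -1;
    if (strcmp(ext, ".dll") != 0)
        return -1;

    domain_len = (size_t)(dash - name) - 3 - 2;
    version_len = (size_t)(ext - dash) - 1;
    if (domain_len >= sizeof out->domain ||
        version_len == 0 || version_len >= sizeof out->version)
        return -1;

    memcpy(out->domain, name + 3, domain_len);
    out->domain[domain_len] = '\0';
    memcpy(out->arch, dash - 2, 2);
    out->arch[2] = '\0';
    memcpy(out->version, dash + 1, version_len);
    out->version[version_len] = '\0';
    return 0;
}
```

Because the architecture code is taken from the two characters before the hyphen, the sketch also copes with domain names longer than one letter.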
In the general case, the “dispatcher” identifies the run-time processor only once, at library initialization time. It sets an internal table or variable that directs your calls to the internal functions that match your architecture. For example, ippsCopy_8u() may have multiple implementations stored in the library, each optimized for a specific Intel® processor architecture. Thus, the dispatcher calls the p8_ippsCopy_8u() version of ippsCopy_8u() when running on an SSE4.1-capable (Penryn) Intel Core 2 Duo processor on IA-32, because that version is optimized for this processor architecture.
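The one-time selection described above can be pictured as a function pointer that is set during initialization and then used by every call. The following is a minimal sketch of that idea, not the actual IPP implementation. All names (dispatcher_init(), copy_8u(), px_copy_8u() and so on) are hypothetical, and the CPU features are passed in as a bitmask instead of being read from CPUID so that the example stays portable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; a real dispatcher would derive these from CPUID. */
enum { FEATURE_SSSE3 = 1, FEATURE_SSE41 = 2 };

typedef void (*copy_fn)(const unsigned char *src, unsigned char *dst, size_t len);

/* Stand-ins for the processor-specific implementations. They all behave the
 * same, mirroring the "functionally identical" property of the real libraries. */
static void px_copy_8u(const unsigned char *s, unsigned char *d, size_t n) { memcpy(d, s, n); }
static void v8_copy_8u(const unsigned char *s, unsigned char *d, size_t n) { memcpy(d, s, n); }
static void p8_copy_8u(const unsigned char *s, unsigned char *d, size_t n) { memcpy(d, s, n); }

/* Assumption of this sketch: generic C code is used until initialization runs. */
static copy_fn active_copy = px_copy_8u;
static const char *active_arch = "px";

/* Runs once; picks the best implementation the processor supports. */
void dispatcher_init(unsigned features)
{
    if (features & FEATURE_SSE41) {
        active_copy = p8_copy_8u;
        active_arch = "p8";
    } else if (features & FEATURE_SSSE3) {
        active_copy = v8_copy_8u;
        active_arch = "v8";
    } else {
        active_copy = px_copy_8u;
        active_arch = "px";
    }
}

/* Public entry point: every call goes through the pointer chosen at init. */
void copy_8u(const unsigned char *src, unsigned char *dst, size_t len)
{
    active_copy(src, dst, len);
}

/* Reports which implementation is currently selected. */
const char *dispatched_arch(void)
{
    return active_arch;
}
```

The per-call cost in this sketch is a single indirect call; the feature check itself happens only inside dispatcher_init().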
Note: IPP architectures generally correspond to SIMD (MMX, SSE, AES, etc.) instruction sets.
Initializing the IPP Dispatcher
The process of identifying the specific processor being used, and initialization of the dispatcher, should be performed before you make any calls into the IPP library. If you are using a dynamic link library this process is handled automatically when the dynamic link library is initialized. However, if you are using a static library you must perform this step manually. See this article on the ipp*Init*() functions for more information on how to do this.
The following table lists all the architecture codes defined by the Intel IPP library through version 6.1 of the product. Note that some of these IPP architectures have been deprecated and are no longer supported in the current version of the product. Deprecated architectures are identified in the “Notes” column of the table.
|Platform||Architecture||SIMD Requirements||Processor / µarchitecture||Notes|
|IA-32||px||C-optimized for all IA-32 processors||i386+|| |
|IA-32||a6||SSE||Pentium III||thru 5.3 only|
|IA-32||w7||SSE2||P4, Xeon, Centrino|| |
|IA-32||v8||Supplemental SSE3||Core 2, Xeon® 5100, Atom|| |
|IA-32||s8||Supplemental SSE3 (compiled for Atom)||Atom||new in 6.0|
|IA-32||p8||SSE4.1, SSE4.2, AES-NI||Penryn, Nehalem, Westmere||see notes below|
|IA-32||g9||AVX||Sandy Bridge µarchitecture||new in 6.1|
|Intel® 64 (EM64T)||mx||C-optimized for all Intel® 64 platforms||P4||SSE2 minimum|
|Intel® 64 (EM64T)||u8||Supplemental SSE3||Core 2, Xeon® 5100, Atom|| |
|Intel® 64 (EM64T)||n8||Supplemental SSE3 (compiled for Atom)||Atom||new in 6.0|
|Intel® 64 (EM64T)||y8||SSE4.1, SSE4.2, AES-NI||Penryn, Nehalem, Westmere||see notes below|
|Intel® 64 (EM64T)||e9||AVX||Sandy Bridge µarchitecture||new in 6.1|
|Itanium®||i7||Intel® Itanium® processor family||Itanium|| |
|IXP4xx||sx||C-optimized for IXP4xx processors||IXP/XScale||thru 5.3 only|
|IXP4xx||s2||IXP4xx optimized||IXP/XScale||thru 5.3 only|
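To illustrate how the table is read, the sketch below maps a set of supported SIMD extensions to an architecture code by picking the most capable matching row. It covers only the IA-32 and Intel® 64 rows as printed above for version 6.1 (so a6 is omitted) and leaves out the Atom-specific s8/n8 codes. The function pick_arch_code() and the SIMD_* flags are hypothetical names chosen for this example; the real dispatcher logic is internal to the library and may differ.

```c
#include <stddef.h>

/* Hypothetical flags describing what the processor supports. */
enum {
    SIMD_SSE2  = 1,
    SIMD_SSSE3 = 2,
    SIMD_SSE41 = 4,
    SIMD_AVX   = 8
};

typedef struct {
    unsigned required;    /* SIMD extension the row depends on */
    const char *ia32;     /* architecture code on IA-32 */
    const char *intel64;  /* architecture code on Intel 64 */
} ArchRow;

/* Ordered from most to least capable, following the table above. */
static const ArchRow arch_rows[] = {
    { SIMD_AVX,   "g9", "e9" },
    { SIMD_SSE41, "p8", "y8" },
    { SIMD_SSSE3, "v8", "u8" },
    { SIMD_SSE2,  "w7", "mx" }
};

/* Returns the code of the first (most capable) row the processor satisfies,
 * falling back to the C-optimized code for the platform. */
const char *pick_arch_code(unsigned simd, int is_64bit)
{
    size_t i;
    for (i = 0; i < sizeof arch_rows / sizeof arch_rows[0]; ++i) {
        if (simd & arch_rows[i].required)
            return is_64bit ? arch_rows[i].intel64 : arch_rows[i].ia32;
    }
    return is_64bit ? "mx" : "px";
}
```

For example, a processor reporting SSE2 and Supplemental SSE3 would map to v8 on IA-32 and u8 on Intel® 64 in this sketch.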
For information about support on non-Intel processors, please see the article titled Use Intel® IPP on Intel or Compatible AMD* Processors.
For more information regarding Intel IPP library support for XScale* processors, please see the following article:
PXA9xx / PXA27x / XScale -- How to get Developer Support
P8/Y8 Internal Run-Time Dispatcher
Within the 32-bit p8 and equivalent 64-bit y8 architectures there is an additional "run-time" dispatching mechanism, a kind of mini-dispatcher. The Nehalem (Intel Core i7) and Westmere processor families support SIMD instructions beyond those defined by SSE4.1: the Nehalem family adds the SSE4.2 instructions and the Westmere family adds AES-NI.
Creating two additional internal versions of the IPP library for the SSE4.2 and AES-NI instructions would be very space-inefficient, so these optimizations are bundled into the SSE4.1 library. When you call a function that includes, for example, AES-NI optimizations, an additional jump directs your call to the AES-NI version within the p8/y8 library. Because the enhancements affect the optimization of only a small number of IPP functions, this additional overhead occurs infrequently and only when your application is executing on a p8/y8 architecture processor.
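The second-level jump described above can be sketched as a branch inside the p8 entry point of an affected function. This is an illustration under stated assumptions rather than the library's actual code: the names p8_set_cpu_features(), p8_encrypt() and p8_copy() are hypothetical, and each function simply returns a label for the path taken so the behavior is easy to observe.

```c
/* Hypothetical flag, set once at library initialization. */
static int cpu_has_aesni = 0;

void p8_set_cpu_features(int has_aesni)
{
    cpu_has_aesni = has_aesni;
}

/* Two internal variants bundled inside the same p8 library. */
static const char *encrypt_sse41(void) { return "p8 SSE4.1 path"; }
static const char *encrypt_aesni(void) { return "p8 AES-NI path"; }

/* Entry point the main dispatcher selects for any p8-class processor.
 * The extra branch here is the "mini-dispatcher". */
const char *p8_encrypt(void)
{
    return cpu_has_aesni ? encrypt_aesni() : encrypt_sse41();
}

/* A function with no AES-NI variant needs no second-level check. */
const char *p8_copy(void)
{
    return "p8 SSE4.1 path";
}
```

Only functions that have a more specialized variant pay for the extra branch, which matches the point that the overhead is limited to a small number of functions.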
S8/N8 (Atom) Dispatch
The s8/n8 library (Atom-optimized) is not present in the static libraries, only in the dynamic libraries. However, IPP applications built with the static library will run on an Atom processor using the v8/u8 library, with performance ranging from very good to equivalent to that of the s8/n8 library. The v8/u8 library is dispatched automatically; you do not need to do anything special for the Atom processor.
The Linux distributions of the IPP library include a separate Atom-specific version of the library. However, you do not need to use this Atom-specific library if you are building an IPP application that will be run on multiple processor platforms, including Atom processors. This Atom-only version of the library is provided as a convenience for building IPP applications that will run ONLY on an Atom, as opposed to IPP applications that may run on a variety of processor platforms.
The fundamental difference between the s8/n8 and v8/u8 libraries is the set of compiler options used to build them, which accommodate the differences in instruction pipeline design between the Atom and other SSSE3 processors. Both libraries are Supplemental SSE3 libraries and the s8/n8 version of the IPP library does not use any Atom-unique instructions, so no features are lost by running the v8/u8 slice of the library on an Atom processor. Also, these two variations of the library (s8/n8 and v8/u8) give nearly identical performance on an Atom.
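The Atom behavior described in this section reduces to a small decision on the linkage model. The sketch below restates it in code; the Linkage type and the atom_slice() function are hypothetical names used only for illustration.

```c
/* Hypothetical linkage selector, not part of the IPP API. */
typedef enum { LINK_STATIC, LINK_DYNAMIC } Linkage;

/* Returns the library slice an Atom processor would be served. */
const char *atom_slice(Linkage linkage, int is_64bit)
{
    if (linkage == LINK_DYNAMIC)
        return is_64bit ? "n8" : "s8";  /* Atom-tuned build, dynamic libraries only */
    return is_64bit ? "u8" : "v8";      /* static build uses the general SSSE3 slice */
}
```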
Processor Architecture Table
The following table was copied from an Intel Compiler Pro options article describing some compiler architecture options. It contains a list of Intel processors showing which processors support which SIMD instructions. For the latest table please refer to the original article; it gets updated on a regular basis. Please note that the behavior of the Intel Compiler SIMD dispatcher described in that article does not apply to the Intel IPP library.
The Intel IPP library dispatching mechanism behaves differently than that found in the Intel Compiler products, and may also behave differently than other Intel library products.
Additional information regarding dispatching and how it relates to non-Intel processors can be found here. How to identify your specific processor is described here. To correlate a processor family name with an Intel CPU brand name, use the ark.intel.com web site.
Intel® Core™ i7 processors
Intel® Core™ i5 processors
Intel® Core™ i3 processors
Intel® Xeon® 55XX series
Intel® Xeon® 74XX series
Quad-Core Intel® Xeon 54XX, 33XX series
Dual-Core Intel® Xeon 52XX, 31XX series
Intel® Core™ 2 Extreme 9XXX series
Intel® Core™ 2 Quad 9XXX series
Intel® Core™ 2 Duo 8XXX series
Intel® Core™ 2 Duo E7200
Quad-Core Intel® Xeon® 73XX, 53XX, 32XX series
Dual-Core Intel® Xeon® 72XX, 53XX, 51XX, 30XX series
Intel® Core™ 2 Extreme 7XXX, 6XXX series
Intel® Core™ 2 Quad 6XXX series
Intel® Core™ 2 Duo 7XXX (except E7200), 6XXX, 5XXX, 4XXX series
Intel® Core™ 2 Solo 2XXX series
Intel® Pentium® dual-core processor E2XXX, T23XX series
Dual-Core Intel® Xeon® 70XX, 71XX, 50XX Series
Dual-Core Intel® Xeon® processor (ULV and LV) 1.66, 2.0, 2.16
Dual-Core Intel® Xeon® 2.8
Intel® Xeon® processors with SSE3 instruction set support
Intel® Core™ Duo
Intel® Core™ Solo
Intel® Pentium® dual-core processor T21XX, T20XX series
Intel® Pentium® processor Extreme Edition
Intel® Pentium® D
Intel® Pentium® 4 processors with SSE3 instruction set support
Intel® Xeon® processors
Intel® Pentium® 4 processors
Intel® Pentium® M
Intel® Pentium® III Processor
Intel® Pentium® II Processor
Intel® Pentium® Processor
*Other names and brands may be claimed as the property of others.