Generic Static Library Dispatching with the Intel® IPP 7.0 Library

The px_/mx_ prefixes have been restored to the static generic library (as of version 7.0.4), so you can now link against both the static generic library and the standard Intel IPP product library within a single application.

Note: The feature described in this article is only relevant if you need to deploy your Intel IPP application on platforms that do not support at least SSE2 for an IA-32 (32-bit) application or at least SSE3 for an Intel64 (64-bit) platform.

If all of your platforms support at least SSE2 (32-bit) or SSE3 (64-bit), you do not need to use the procedure described in this article and you do not need to download the generic px/mx static library!

If you are unsure what level of SIMD instructions your target platform(s) support, please visit ark.intel.com and search for your specific processor(s).

Unlike the dynamic library, the automatic dispatcher in the static library will not recognize the generic library and will not automatically dispatch to the generic optimizations provided in the generic px/mx add-on static library; instead, you must call the generic functions directly using the px_/mx_ prefix (as if you were calling an optimized library function directly). This means that if you choose to include the generic static library as part of your application you must decide whether to call the dispatched library or the equivalent generic library function at each Intel IPP function call within your application. Such a decision should be based on an initial evaluation of the platform that determines if you need to use the generic static library functions or if it is safe to call the standard dispatched library functions.

Note: The ippInit() function normally used to initialize the static library dispatcher determines the level of SSE instructions supported on the target processor at runtime using the CPUID instruction.

The manufacturer string returned by the CPUID instruction is not used as part of this test; however, the CPUID results are interpreted according to Intel processor conventions.

This means that if a non-Intel processor reports the SIMD instructions it supports in a way that is compatible with an Intel processor, the test passes (assuming the reported SIMD level is supported by the library); if not, the test fails. It is believed, but cannot be proven, that all x86-compatible processors report their support for SSE2 and SSE3 in a manner that is compatible with Intel processors. After SSE3 (e.g., SSSE3, SSE4.1, etc.) the SIMD instruction sets in use diverge across manufacturers and are, generally, not compatible with the Intel SSE (and AVX) instructions.

Additionally, at this time we are not planning to restore the generic optimization library as an integral dispatched layer within the Intel IPP product. We periodically must make some difficult choices regarding what we can continue to optimize, test and validate. Given that the SSE2 instruction set has been supported by nearly every x86-compatible processor produced for nearly a decade, the number of platforms that cannot run an application that employs the Intel IPP 7.0 library today is very, very small. The generic px/mx layers are still integrated in the 6.1 version of the Intel IPP library.

Please refer to our Optimization Notice for more information regarding performance and optimization choices in Intel software products.

Calling the Generic PX/MX Functions in an Application

With this version of the px/mx generic add-on static library you can now call the generic functions within the same application as you call the dispatched functions. You must, however, implement an additional layer that "manually dispatches" between the generic functions and the standard functions, since the static library dispatcher cannot, for technical reasons, be integrated with the generic px/mx static add-on library. (This is not an issue with the standard dynamic library.)

The basic idea is best shown by a simple example for use with the px (32-bit) version of the generic static library:

#include "ipp.h"
#include "ipp_generic.h"

Ipp64u ipp_cpuid = 0 ;
IppStatus ipp_init_status = ippInit() ;
IppStatus status ;

// determine processor type/status and set "ipp_cpuid"
// see SIMD detection example further in article...

Ipp8u src[] = "to be copied" ;   // the literal already supplies the terminating '\0'
Ipp8u dst[256] ;
int len = (int)sizeof(src) ;     // number of bytes to copy, including the terminator

if( ipp_cpuid < ippCPUID_SSE2 )
    status = px_ippsCopy_8u( src, dst, len ) ;   // generic code path
else
    status = ippsCopy_8u( src, dst, len ) ;      // standard dispatched code path


In the example above, during initialization you must determine whether the application should use the “generic” px code or the standard library. If the runtime processor supports only SIMD instruction sets older than SSE2 (for example, only MMX or SSE), the application calls the generic px functions; otherwise, it calls the standard library functions.

If you are writing a 64-bit application, use the mx prefix on the generic function call and make the conditional check against ippCPUID_SSE3, since SSE3 is the minimum level supported by the 64-bit standard dispatched library (SSE2 is the minimum level supported by the 32-bit library).
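
One way to avoid repeating the if/else at every call site is to choose a function pointer once during initialization. The following is a minimal sketch and not part of the Intel IPP distribution: the typedefs and the two stub_* functions are stand-ins so that the fragment compiles without the library, and select_copy8u is a hypothetical helper name. In a real application the stubs would be replaced by ippsCopy_8u and px_ippsCopy_8u (or mx_ippsCopy_8u for a 64-bit build).

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles without Intel IPP installed.
   In a real build these come from ipp.h and your ipp_generic.h. */
typedef unsigned char Ipp8u;
typedef int IppStatus;
enum { ippStsNoErr = 0 };

/* Hypothetical stub standing in for px_ippsCopy_8u (generic code). */
static IppStatus stub_generic_copy(const Ipp8u *src, Ipp8u *dst, int len) {
    memcpy(dst, src, (size_t)len);
    return ippStsNoErr;
}

/* Hypothetical stub standing in for ippsCopy_8u (dispatched code). */
static IppStatus stub_dispatched_copy(const Ipp8u *src, Ipp8u *dst, int len) {
    memcpy(dst, src, (size_t)len);
    return ippStsNoErr;
}

typedef IppStatus (*copy8u_fn)(const Ipp8u *, Ipp8u *, int);

/* Picks the implementation once; use_generic is the result of the
   platform check made during initialization. */
static inline copy8u_fn select_copy8u(int use_generic) {
    return use_generic ? stub_generic_copy : stub_dispatched_copy;
}
```

The application would store the returned pointer in a global during startup and call through it afterwards, which keeps the "manual dispatch" decision in one place.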

Building Your Generic Include File

The example above includes a file called "ipp_generic.h," which is not distributed with either the standard product or the add-on generic library. You must build this include file for use with your application.

For example, assume that you are using all of the ippsCopy() functions. In that case you would copy from the standard ipps.h header file the function declarations for the ippsCopy() functions you are using. Then add "px_" (or "mx_" for 64-bit applications) to the name of each function declaration. This will provide you with the external function declarations you need in order to call the generic functions. In this case, ipp_generic.h would look like:

IPPAPI(IppStatus, px_ippsCopy_8u,( const Ipp8u* pSrc, Ipp8u* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_16s,( const Ipp16s* pSrc, Ipp16s* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_16sc,( const Ipp16sc* pSrc, Ipp16sc* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_32f,( const Ipp32f* pSrc, Ipp32f* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_32fc,( const Ipp32fc* pSrc, Ipp32fc* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_64f,( const Ipp64f* pSrc, Ipp64f* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_64fc,( const Ipp64fc* pSrc, Ipp64fc* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_32s,( const Ipp32s* pSrc, Ipp32s* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_32sc,( const Ipp32sc* pSrc, Ipp32sc* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_64s,( const Ipp64s* pSrc, Ipp64s* pDst, int len ))
IPPAPI(IppStatus, px_ippsCopy_64sc,( const Ipp64sc* pSrc, Ipp64sc* pDst, int len ))

Make sure you include ipp.h before you include your custom ipp_generic.h file, so all macro and data type definitions have been taken care of before you declare your generic functions.

Of course, you must also be sure to include the appropriate generic library in the list of libraries that your application will link against. The "USE_IPP" feature that does this automatically for you in Microsoft* Visual Studio* WILL NOT do this for you!

The ZIP file attached to this KB article is an example of how you can set up your ipp_generic.h file automatically using a C macro redefinition. The ZIP file also includes a simple test application.
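
The contents of the attached ZIP file are not reproduced here. As an illustration of the general idea only, the sketch below shows one possible way to generate prefixed declarations with the preprocessor instead of editing each line by hand. It is an assumption, not the method used in the ZIP file: MY_IPP_FUNCTIONS and DECLARE_PX are hypothetical names, the typedefs are stand-ins for the types defined by ipp.h, and the real IPPAPI macro may add a calling-convention keyword that this sketch omits.

```c
/* Stand-ins for types that ipp.h would normally define. */
typedef unsigned char Ipp8u;
typedef short Ipp16s;
typedef int IppStatus;

/* One entry per function you use, written in the same form as the
   declarations in ipps.h. The macro parameter is named IPPAPI so the
   entries can be pasted in unchanged. */
#define MY_IPP_FUNCTIONS(IPPAPI) \
    IPPAPI(IppStatus, ippsCopy_8u,  (const Ipp8u*  pSrc, Ipp8u*  pDst, int len)) \
    IPPAPI(IppStatus, ippsCopy_16s, (const Ipp16s* pSrc, Ipp16s* pDst, int len))

/* Expands one entry into an extern declaration with px_ pasted onto
   the function name (use mx_ instead for a 64-bit build). */
#define DECLARE_PX(type, name, args) extern type px_##name args;

/* Emits:
   extern IppStatus px_ippsCopy_8u (...);
   extern IppStatus px_ippsCopy_16s (...); */
MY_IPP_FUNCTIONS(DECLARE_PX)
```

With this arrangement, adding another generic function means adding one line to the list rather than a new hand-edited declaration.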

Determining What Level of SIMD Your Processor Supports

To determine if your processor will be supported by the standard Intel IPP 7.0 library you can use the following test:

Ipp64u u64FeaturesMask = ippCPUID_GETINFO_A ;
Ipp32u u32CpuidInfoRegs[] = { 1, 0, 0, 0 } ;
IppStatus ipp_status ;

ipp_status = ippGetCpuFeatures( &u64FeaturesMask, u32CpuidInfoRegs ) ;
if( ipp_status != ippStsNoErr )
    /* handle error condition returned by ipp_status */ ;

// keep only the low nine feature bits of the mask
ipp_cpuid = u64FeaturesMask & 0x1ff ;

The contents of ipp_cpuid can be compared against the "CPU Features Mask" enumerations to determine which level of SIMD instructions is supported (see the sample code earlier in this article). A complete table of "CPU Features Mask" enumerations is provided here:

/sites/products/documentation/hpc/composerxe/en-us/2011Update/ippxe/ipp_manual_lnx/hh_goto.htm

The definition of the "CPU Features Mask" is located inside the ippdefs.h header file.
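
The decision itself can be isolated in a small helper. The sketch below is hypothetical and uses stand-in constants: the DEMO_CPUID_* values are assumed to mirror the ippCPUID_* flags in ippdefs.h and should be verified against your installed header, and needs_generic is not an Intel IPP function. Unlike the numeric comparison used earlier in this article, it tests the single required feature bit directly.

```c
/* Stand-in bit values, ASSUMED to match the ippCPUID_* flags in ippdefs.h.
   In a real application use the ippCPUID_* names from the header instead. */
#define DEMO_CPUID_MMX   0x01u
#define DEMO_CPUID_SSE   0x02u
#define DEMO_CPUID_SSE2  0x04u
#define DEMO_CPUID_SSE3  0x08u

/* Returns nonzero when the generic (px_/mx_) functions should be called.
   features : the masked value stored in ipp_cpuid
   is_64bit : nonzero for a 64-bit build, which requires SSE3;
              a 32-bit build requires SSE2 */
static inline int needs_generic(unsigned long long features, int is_64bit) {
    unsigned long long required = is_64bit ? DEMO_CPUID_SSE3 : DEMO_CPUID_SSE2;
    return (features & required) == 0;
}
```

For example, a processor reporting only MMX and SSE needs the generic library in either build, while one reporting up to SSE2 needs it only for a 64-bit build.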

This is just one example; it is not the only way to determine which SIMD instructions your runtime processor supports. Other methods are available, such as your compiler's cpuid intrinsic.
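
As a sketch of the compiler-intrinsic alternative, the fragment below decodes the feature bits returned by CPUID leaf 1 (SSE2 is EDX bit 26, SSE3 is ECX bit 0). The helper names are hypothetical. The query function assumes a GCC- or Clang-compatible compiler that provides <cpuid.h> and is compiled only on x86 targets; other compilers offer their own intrinsics.

```c
#include <stdint.h>

/* CPUID leaf 1 feature bits: EDX bit 26 = SSE2, ECX bit 0 = SSE3. */
static inline int leaf1_has_sse2(uint32_t edx) { return (int)((edx >> 26) & 1u); }
static inline int leaf1_has_sse3(uint32_t ecx) { return (int)(ecx & 1u); }

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Reads ECX and EDX of CPUID leaf 1 from the running processor.
   Returns 1 on success, 0 if leaf 1 is not available. */
static inline int query_leaf1(uint32_t *ecx, uint32_t *edx) {
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (!__get_cpuid(1, &a, &b, &c, &d))
        return 0;
    *ecx = c;
    *edx = d;
    return 1;
}
#endif
```

An application could call query_leaf1 at startup and pass the results to the two decoding helpers to decide whether the generic library is required.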

Attachment: ipp-generic.zip (870 bytes)