Intel® IPP Functions Optimized for Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

Below is the list of Intel® Integrated Performance Primitives (Intel® IPP) functions that are optimized for Intel® Advanced Vector Extensions 512 (Intel® AVX-512). About 200 of these functions are optimized for both the Intel® Xeon Phi™ processor x200 (formerly Knights Landing) and the Intel® Xeon® processor code name Skylake; the remaining functions are optimized for the Intel® Xeon® processor code name Skylake only. Please note that the tables below list only the Intel IPP library functions that have been "hand-tuned" for optimal performance. Intel IPP functions that are not listed here can still benefit from optimizations performed by the Intel® Compiler.
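To see which of the two function groups in this article is relevant to a given machine, it can help to look at the AVX-512 subsets the processor reports. The sketch below is an illustration only and is not part of Intel IPP. It assumes a Linux host that exposes feature flags in /proc/cpuinfo, and it assumes that the Skylake-only group is relevant whenever the AVX-512 BW, DQ and VL subsets (which Skylake server processors implement and the Xeon Phi processor x200 does not) are all present. The helper names `applicable_groups` and `read_cpu_flags` are hypothetical.

```python
# Illustrative sketch, not part of Intel IPP.
# Assumption: flag names follow the spelling used in Linux /proc/cpuinfo.

# AVX-512 subsets that Skylake server processors add on top of AVX-512F.
# Assumption: their presence is used here as a stand-in for "Skylake".
SKYLAKE_SUBSETS = {"avx512bw", "avx512dq", "avx512vl"}


def applicable_groups(flags):
    """Return the function groups from this article that apply to a CPU.

    "both"         -> functions optimized for Xeon Phi x200 and Skylake
    "skylake_only" -> functions optimized for Skylake only
    """
    flags = set(flags)
    if "avx512f" not in flags:
        return []  # no AVX-512 foundation, so neither group applies
    groups = ["both"]
    if SKYLAKE_SUBSETS <= flags:
        groups.append("skylake_only")
    return groups


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the feature flags of the first CPU entry (Linux only)."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.partition(":")[2].split())
    except OSError:
        pass  # file not available, for example on a non-Linux host
    return set()


if __name__ == "__main__":
    print(applicable_groups(read_cpu_flags()))
```

The classification is kept in a pure function so it can be checked against a hand-written flag set without access to the target hardware. Which code path Intel IPP actually dispatches to is decided by the library itself, not by this check.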

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

Intel® AVX-512 optimization for both Intel® Xeon Phi™ processor x200 (codename Knights Landing) and Intel® Xeon® processor (codename Skylake)

ippiResizeNearest_32f_C1R           ippiMean_32f_C1R
ippiResizeLanczos_32f_C1R           ippiMean_32f_C3R
ippiResizeCubic_32f_C1R             ippiMean_32f_C4R
ippiResizeSuper_32f_C1R             ippiNorm_Inf_32f_C1R
ippiResizeLinear_32f_C1R            ippiNorm_Inf_32f_C3R
ippiFilterBorder_32f_C1R            ippiNorm_Inf_32f_C4R
ippiFilter_64f_C1R                  ippiNorm_L1_32f_C1R
ippiAdd_32f_C1R                     ippiNorm_L1_32f_C3R
ippiSub_32f_C1R                     ippiNorm_L1_32f_C4R
ippiMul_32f_C1R                     ippiNorm_L2_32f_C1R
ippiDiv_32f_C1R                     ippiNorm_L2_32f_C3R
ippiAddC_32f_C1R                    ippiNorm_L2_32f_C4R
ippiSubC_32f_C1R                    ippiNormRel_Inf_32f_C1R
ippiMulC_32f_C1R                    ippiNormRel_L1_32f_C1R
ippiDivC_32f_C1R                    ippiNormRel_L2_32f_C1R
ippsAdd_32fc                        ippiNormDiff_Inf_32f_C1R
ippsAdd_64fc                        ippiNormDiff_Inf_32f_C3R
ippsSub_32fc                        ippiNormDiff_Inf_32f_C4R
ippsSub_64fc                        ippiNormDiff_L1_32f_C1R
ippsMul_32fc                        ippiNormDiff_L1_32f_C3R
ippsMul_64fc                        ippiNormDiff_L1_32f_C4R
ippsDiv_32fc                        ippiNormDiff_L2_32f_C1R
ippsDiv_64fc                        ippiNormDiff_L2_32f_C3R
ippsAddC_32fc                       ippiNormDiff_L2_32f_C4R
ippsAddC_64fc                       ippiSwapChannels_32f_C3C4R
ippsSubC_32fc                       ippiSwapChannels_32f_C4C3R
ippsSubC_64fc                       ippiSwapChannels_32f_C3R
ippsMulC_32fc                       ippiSwapChannels_32f_AC4R
ippsMulC_64fc                       ippiCopy_32f_AC4C3R
ippsDivC_32fc                       ippiCopy_32f_P3C3R
ippsDivC_64fc                       ippiAbsDiff_32f_C1R
ippiGradientVectorSobel_32f_C1R     ippiMinMaxIndx_32f_C1MR
ippiGradientVectorScharr_32f_C1R    ippiMinMaxIndx_32f_C1R
ippiGradientVectorPrewitt_32f_C1R   ippiFilterRowBorderPipeline_32f_C1R
ippiThreshold_32f_C1R               ippiFilterRowBorderPipeline_32f_C3R
ippiThreshold_LT_32f_C1R            ippiFilterScharrVertMaskBorderGetBufferSize
ippiThreshold_Val_32f_C1R           ippiFilterScharrVertMaskBorder_32f_C1R
ippiThreshold_GT_32f_C1R            ippiFilterScharrHorizBorder_32f_C1R
ippiThreshold_GTVal_32f_C1R         ippiFilterSobelNegVertBorder_32f_C1R
ippiThreshold_LTVal_32f_C1R         ippiFilterSobelHorizBorder_32f_C1R
ippiThreshold_LTValGTVal_32f_C1R    ippiFilterSobelVertSecondBorder_32f_C1R
ippiCopyReplicateBorder_32f_C1R     ippiFilterSobelHorizSecondBorder_32f_C1R
ippiCopyConstBorder_32f_C1R         ippiFilterScharrHorizMaskBorder_32f_C1R
ippiCopyMirrorBorder_32f_C1R        ippiNorm_Inf_32f_C1MR
ippiSet_32f_C1R                     ippiNorm_L1_32f_C1MR
ippiMirror_32f_C1R                  ippiNorm_L2_32f_C1MR
ippiWarpAffineNearest_32f_C1R       ippiNorm_Inf_32f_C3CMR
ippiWarpAffineLinear_32f_C1R        ippiNorm_L1_32f_C3CMR
ippiWarpAffineCubic_32f_C1R         ippiNorm_L2_32f_C3CMR
ippiSqrt_32f_C1R                    ippiNormRel_Inf_32f_C1MR
ippiFilterMedianBorder_32f_C1R      ippiNormRel_L1_32f_C1MR
ippiFilterMaxBorder_32f_C1R         ippiNormRel_L2_32f_C1MR
ippiFilterMinBorder_32f_C1R         ippiNormDiff_Inf_32f_C1MR
ippiDCT8x8Fwd_32f_C1                ippiNormDiff_L1_32f_C1MR
ippiConv_32f_C1R                    ippiNormDiff_L2_32f_C1MR
ippiDCT8x8Inv_32f_C1                ippiNormDiff_Inf_32f_C3CMR
ippiRemap_32f_C1R                   ippiNormDiff_L1_32f_C3CMR
ippiHistogram_32f_C1R               ippiNormDiff_L2_32f_C3CMR
ippiCompare_32f_C1R                 ippiDotProd_32f64f_C1R
ippiFilterColumnPipeline_32f_C1R    ippiDotProd_32f64f_C3R
ippsConvolve_32f                    ippiDotProd_32f64f_C4R
ippsConvolve_64f                    ippiCrossCorrNorm_32f_C1R
ippsIIR_32f                         ippiDilateBorder_32f_C1R
ippsIIR_64f                         ippiDilateBorder_32f_C3R
ippsIIR_32fc                        ippiDilateBorder_32f_C4R
ippsIIR_64fc                        ippiErodeBorder_32f_C1R
ippsFIRLMS_32f                      ippiErodeBorder_32f_C3R
ippsFIRSR_32f                       ippiErodeBorder_32f_C4R
ippsFIRSR_64f                       ippiFilterBoxBorder_32f_C1R
ippsFIRSR_32fc                      ippiFilterBoxBorder_32f_C3R
ippsFIRSR_64fc                      ippiFilterBoxBorder_32f_C4R
ippsCrossCorrNorm_32f               ippiFilterGaussianBorder_32f_C1R
ippsCrossCorrNorm_64f               ippiAdd_32f_C1IMR
ippsCrossCorrNorm_32fc              ippiAdd_32f_C1IR
ippsCrossCorrNorm_64fc              ippiAddC_32f_C1IR
ippsAutoCorrNorm_32f                ippiAddProduct_32f_C1IMR
ippsAutoCorrNorm_64f                ippiAddProduct_32f_C1IR
ippsAutoCorrNorm_32fc               ippiAddSquare_32f_C1IMR
ippsAutoCorrNorm_64fc               ippiAddSquare_32f_C1IR
ippiMaxEvery_32f_C1R                ippiAddWeighted_32f_C1IMR
ippiMinEvery_32f_C1R                ippiAddWeighted_32f_C1IR
ippiSum_32f_C1R                     ippiScaleC()
ippiSum_32f_C3R
ippiSum_32f_C4R
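The function names in these lists pack several pieces of information into one identifier. The sketch below splits a name into its parts so that entries can be grouped or filtered by data type. It relies on the general Intel IPP naming pattern (a domain prefix such as ippi for image processing or ipps for signal processing, then the operation, the data type, and an optional descriptor), which this article does not spell out, so treat the interpretation as an assumption. The helper name `split_ipp_name` is hypothetical and not part of Intel IPP.

```python
import re

# One or more bit-depth/type codes written back to back,
# for example "32f", "64fc", "8u16s" or "32f64f".
_DATA_TYPE = re.compile(r"(?:\d+(?:fc|sc|u|s|f))+")


def split_ipp_name(name):
    """Split an IPP-style function name into
    (prefix, operation, data_type, descriptor).

    Assumption: only the "ippi" and "ipps" prefixes used in this
    article are handled; anything else raises ValueError.
    """
    prefix, rest = name[:4], name[4:]
    if prefix not in ("ippi", "ipps"):
        raise ValueError("unrecognized prefix in %r" % name)
    tokens = rest.split("_")
    for i, token in enumerate(tokens):
        # The first token that looks like a data type separates the
        # operation (before it) from the descriptor (after it).
        if _DATA_TYPE.fullmatch(token):
            operation = "_".join(tokens[:i])
            descriptor = "_".join(tokens[i + 1:])
            return (prefix, operation, token, descriptor)
    # Names without a data type, such as buffer-size helpers.
    return (prefix, rest, None, "")
```

For example, `split_ipp_name("ippiAdd_8u_C1RSfs")` returns `("ippi", "Add", "8u", "C1RSfs")`. Operation modifiers such as `LT`, `Inf` or `L1` stay with the operation because they do not start with a digit.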

Intel® AVX-512 optimization for Intel® Xeon® processor (codename Skylake) only

ippiAdd_8u_C1RSfs                         ippiThreshold_LT_8u_C1R
ippiAdd_16s_C1RSfs                        ippiThreshold_LT_16s_C1R
ippiAdd_16u_C1RSfs                        ippiThreshold_LT_16u_C1R
ippiSub_8u_C1RSfs                         ippiThreshold_Val_8u_C1R
ippiSub_16s_C1RSfs                        ippiThreshold_Val_16s_C1R
ippiSub_16u_C1RSfs                        ippiThreshold_Val_16u_C1R
ippiMul_8u_C1RSfs                         ippiThreshold_GT_8u_C1R
ippiMul_16s_C1RSfs                        ippiThreshold_GT_16s_C1R
ippiMul_16u_C1RSfs                        ippiThreshold_GT_16u_C1R
ippiAddC_8u_C1RSfs                        ippiThreshold_GTVal_8u_C1R
ippiAddC_16s_C1RSfs                       ippiThreshold_GTVal_16s_C1R
ippiAddC_16u_C1RSfs                       ippiThreshold_GTVal_16u_C1R
ippiSubC_8u_C1RSfs                        ippiThreshold_LTVal_8u_C1R
ippiSubC_16s_C1RSfs                       ippiThreshold_LTVal_16s_C1R
ippiSubC_16u_C1RSfs                       ippiThreshold_LTVal_16u_C1R
ippiMulC_8u_C1RSfs                        ippiThreshold_LTValGTVal_8u_C1R
ippiMulC_16s_C1RSfs                       ippiThreshold_LTValGTVal_16s_C1R
ippiMulC_16u_C1RSfs                       ippiThreshold_LTValGTVal_16u_C1R
ippiDiv_8u_C1RSfs                         ippiCopy_8u_C1R
ippiDiv_16s_C1RSfs                        ippiCopyReplicateBorder_8u_C1R
ippiDiv_16u_C1RSfs                        ippiCopyReplicateBorder_16s_C1R
ippiDivC_8u_C1RSfs                        ippiCopyReplicateBorder_16u_C1R
ippiDivC_16s_C1RSfs                       ippiCopyConstBorder_8u_C1R
ippiDivC_16u_C1RSfs                       ippiCopyConstBorder_16s_C1R
ippiSqrt_8u_C1RSfs                        ippiCopyConstBorder_16u_C1R
ippiSqrt_16s_C1RSfs                       ippiCopyMirrorBorder_8u_C1R
ippiSqrt_16u_C1RSfs                       ippiCopyMirrorBorder_16s_C1R
ippiDotProd_8u64f_C1R                     ippiCopyMirrorBorder_16u_C1R
ippiDotProd_16u64f_C1R                    ippiSet_16u_C1R
ippiDotProd_16s64f_C1R                    ippiSet_16s_C1R
ippiConvert_8u32f_C1R                     ippiSet_8u_C1R
ippiConvert_16s32f_C1R                    ippiMirror_8u_C1R
ippiConvert_16u32f_C1R                    ippiMirror_16u_C1R
ippiConvert_32f8u_C1RSfs                  ippiMirror_16s_C1R
ippiConvert_32f16s_C1RSfs                 ippiWarpAffineNearest_8u_C1R
ippiConvert_32f16u_C1RSfs                 ippiWarpAffineNearest_16s_C1R
ippiBinToGray_1u8u_C1R                    ippiWarpAffineNearest_16u_C1R
ippiBinToGray_1u16u_C1R                   ippiWarpAffineLinear_8u_C1R
ippiBinToGray_1u16s_C1R                   ippiWarpAffineLinear_16s_C1R
ippiBinToGray_1u32f_C1R                   ippiWarpAffineLinear_16u_C1R
ippiGrayToBin_8u1u_C1R                    ippiWarpAffineCubic_8u_C1R
ippiGrayToBin_16u1u_C1R                   ippiWarpAffineCubic_16s_C1R
ippiGrayToBin_16s1u_C1R                   ippiWarpAffineCubic_16u_C1R
ippiGrayToBin_32f1u_C1R                   ippiNormRel_Inf_8u_C1R
ippiResizeNearest_8u_C1R                  ippiNormRel_Inf_16u_C1R
ippiResizeNearest_16u_C1R                 ippiNormRel_Inf_16s_C1R
ippiResizeNearest_16s_C1R                 ippiNormRel_L1_8u_C1R
ippiResizeLanczos_8u_C1R                  ippiNormRel_L1_16u_C1R
ippiResizeLanczos_16u_C1R                 ippiNormRel_L1_16s_C1R
ippiResizeLanczos_16s_C1R                 ippiNormRel_L2_8u_C1R
ippiResizeCubic_8u_C1R                    ippiNormRel_L2_16u_C1R
ippiResizeCubic_16u_C1R                   ippiNormRel_L2_16s_C1R
ippiResizeCubic_16s_C1R                   ippiFilterMedianBorder_8u_C1R
ippiResizeSuper_8u_C1R                    ippiFilterMedianBorder_16u_C1R
ippiResizeSuper_16u_C1R                   ippiFilterMedianBorder_16s_C1R
ippiResizeSuper_16s_C1R                   ippiFilterMaxBorder_8u_C1R
ippiResizeLinear_8u_C1R                   ippiFilterMaxBorder_16u_C1R
ippiResizeLinear_16u_C1R                  ippiFilterMaxBorder_16s_C1R
ippiResizeLinear_16s_C1R                  ippiFilterMinBorder_8u_C1R
ippiFilterBorder_8u_C1R                   ippiFilterMinBorder_16u_C1R
ippiFilterBorder_16u_C1R                  ippiFilterMinBorder_16s_C1R
ippiFilterBorder_16s_C1R                  ippiFilterBoxBorder_8u_C1R
ippiGradientVectorSobel_8u16s_C1R         ippiFilterBoxBorder_16u_C1R
ippiGradientVectorSobel_16u32f_C1R        ippiFilterBoxBorder_16s_C1R
ippiGradientVectorSobel_16s32f_C1R        ippiConv_16s_C1R
ippiGradientVectorScharr_8u16s_C1R        ippiConv_8u_C1R
ippiGradientVectorScharr_16u32f_C1R       ippiMean_8u_C1R
ippiGradientVectorScharr_16s32f_C1R       ippiMean_16u_C1R
ippiGradientVectorPrewitt_8u16s_C1R       ippiMean_16s_C1R
ippiGradientVectorPrewitt_16u32f_C1R      ippiDilateBorder_1u_C1R
ippiGradientVectorPrewitt_16s32f_C1R      ippiDilateBorder_16u_C1R
ippiGradientVectorSobel_8u16s_C3C1R       ippiErodeBorder_16u_C1R
ippiGradientVectorSobel_16u32f_C3C1R      ippiErodeBorder_16s_C1R
ippiGradientVectorSobel_16s32f_C3C1R      ippiErodeBorder_1u_C1R
ippiGradientVectorScharr_8u16s_C3C1R      ippiFilterRowBorderPipeline_8u16s_C1R
ippiGradientVectorScharr_16u32f_C3C1R     ippiFilterRowBorderPipeline_16s_C1R
ippiGradientVectorScharr_16s32f_C3C1R     ippiFilterRowBorderPipeline_16u_C1R
ippiGradientVectorPrewitt_8u16s_C3C1R     ippiFilterColumnPipeline_16s_C1R
ippiGradientVectorPrewitt_16u32f_C3C1R    ippiFilterColumnPipeline_16u_C1R
ippiGradientVectorPrewitt_16s32f_C3C1R    ippiFilterColumnPipeline_16s8u_C1R
ippiThreshold_8u_C1R                      ippiFilterMedianBorder_32f
ippiThreshold_16s_C1R                     ipp_MulPack_32f_C3R
ippiThreshold_16u_C1R

For more complete information about compiler optimizations, see our Optimization Notice.