Intel® Integrated Performance Primitives

uncore performance-monitoring events

I am using a machine that has an Intel Xeon(R) X5570 CPU (2.93 GHz) in an IBM System x3650 M2 server.
I have been experimenting with the manual
"Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide, Part 2", Chapter 19, performance-monitoring events.

I want to get information about uncore events, for example those in this manual's Table 19-14, "Non-Architectural Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.)".

64 bit C# wrapper for ipp 6.1

I am still using IPP 6.1 wrapped in the C# library ipp_cs. I don't need to update the version as of now. At present I am using the 32-bit version of the C# wrapper, and I need to migrate to the 64-bit version of this library. I couldn't find it on the products page. Could someone please advise where I can download the 64-bit version of the C# wrapper for IPP?

Updated from IPP 7.1.1 to 8.2.1, seeing segmentation faults on AVX (e9)

We have been using the Intel IPP libraries for many years now (Dialogic was once an Intel company :)). A few years back we updated to version 7.1.1 and all was well until we ran into some segmentation faults on certain newer systems. The crashes were on systems whose processors support AVX and AVX2. We found that we were able to work around this by limiting the CPU type to AVX.

We recently updated to IPP 8.2.1 hoping that this limitation would no longer be required.  However, we are seeing more frequent segmentation faults on systems which support AVX using the e9 IPP functions.

performance issue ippiCrossCorrNorm_8u32f_C1R


I compared the ipp703 call ippiCrossCorrValid_NormLevel_8u32f_C1R

to the 802 call ippiCrossCorrNorm_8u32f_C1R

and measured timing in endless loops (all buffers pre-allocated, 1000x1000 image, 10x10 template).

Results: see attachment.

First trial: a sleep(0) call was used in the loop.

The 703 function turns out to be 4x faster (!) than the new 802 function, but the CPU load is extreme and would leave no room for other tasks in complex applications.

Second trial: a sleep(100) call was used in the loop.
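A minimal sketch of the kind of timing loop described here, assuming the goal is the mean wall-clock time per call with any sleep kept outside the timed region. The names `mean_seconds_per_call`, `work_fn` and `count_calls` are hypothetical; the real workload would be a wrapper around the IPP call being measured, which is not shown. `timespec_get` (C11) reports calendar time rather than a monotonic clock, so a platform-specific monotonic timer may be preferable for serious measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one invocation of the function under test. */
typedef void (*work_fn)(void *ctx);

/* Returns the mean wall-clock seconds per call of fn.
   Only the call itself is timed; a sleep between iterations
   would sit outside the two timestamps. */
double mean_seconds_per_call(work_fn fn, void *ctx, int iterations)
{
    assert(fn != NULL && iterations > 0);
    double total = 0.0;
    for (int i = 0; i < iterations; ++i) {
        struct timespec t0, t1;
        timespec_get(&t0, TIME_UTC);
        fn(ctx);
        timespec_get(&t1, TIME_UTC);
        total += (double)(t1.tv_sec - t0.tv_sec)
               + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
        /* sleep(0) or sleep(100) would go here, untimed */
    }
    return total / iterations;
}

/* Stand-in workload for illustration: counts how often it was called. */
void count_calls(void *ctx)
{
    ++*(int *)ctx;
}
```

Timing each call separately, rather than the whole loop, keeps the sleep duration out of the reported figure, so the two trials stay comparable.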

Converting from pixel order YCbCr411 to BGR

Hi all,

I'm new to using the IPP library and I'm having trouble converting pixel-order YCbCr411 data to BGR format. From reading the documentation, I know that I need to convert the pixel-order data to planar format before calling the library function ippiYCbCr411ToBGR_8u_P3C4R, which converts planar YCbCr411 data to BGR format. I see that I can probably use the ippiCopy_8u_C3P3R call to convert my image buffer from pixel order to planar order.
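For illustration, the pixel-order to planar step mentioned above can be sketched in plain C without IPP. This is only a sketch of what a C3-to-P3 copy does for fully sampled, 3-channel interleaved 8-bit data; the function name `copy_c3_to_p3` is hypothetical, and whether the 4:1:1 buffer in question actually has three samples per pixel is an assumption that would need checking, since 4:1:1 data is chroma-subsampled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* De-interleave a 3-channel pixel-order (C3) 8-bit image into three
   separate planes (P3). Steps are row strides in bytes, following the
   row-step convention used by IPP image functions. */
void copy_c3_to_p3(const uint8_t *src, size_t src_step,
                   uint8_t *const dst[3], size_t dst_step,
                   size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < height; ++y) {
        const uint8_t *row = src + y * src_step;
        for (size_t x = 0; x < width; ++x) {
            for (int c = 0; c < 3; ++c) {
                /* sample c of pixel x goes to plane c */
                dst[c][y * dst_step + x] = row[3 * x + c];
            }
        }
    }
}
```

If the source buffer is subsampled rather than three samples per pixel, a plain C3-to-P3 copy would not produce correctly sized chroma planes, so the buffer layout is worth confirming first.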

For example, is the following the correct approach?

FilterLaplace generates different results when running the 32-Bit vs. 64-Bit

We have observed different results when running the FilterLaplace function between the 32-bit and the 64-bit version of the IPP libraries.
Specifically we are calling: ippiFilterLaplace_8u_AC4R with both 3X3 and 5X5 kernels.
The binary results differ between the 32 and 64 bit libraries.  Is this expected?  We are currently running IPP 7.0.
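One way to quantify such a difference is to compare the two output buffers byte by byte. The following is a minimal sketch in plain C, assuming both outputs have been saved as 8-bit buffers of equal length; `diff_stats` and `compare_u8` are hypothetical names and not part of IPP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of how two 8-bit buffers differ. */
typedef struct {
    size_t differing;   /* number of bytes that are not equal */
    int max_abs_diff;   /* largest absolute per-byte difference */
} diff_stats;

/* Compare n bytes of a and b and report the count and size of mismatches. */
diff_stats compare_u8(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    diff_stats s = {0, 0};
    for (size_t i = 0; i < n; ++i) {
        int d = (int)a[i] - (int)b[i];
        if (d < 0) d = -d;
        if (d != 0) ++s.differing;
        if (d > s.max_abs_diff) s.max_abs_diff = d;
    }
    return s;
}
```

A maximum difference of 1 would point toward rounding differences between code paths, while larger values would suggest something else is going on.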

Many thanks,


ippsDotProd_32f Performance on Haswell CPU


At the moment I'm using ippsDotProd_32f in IPP 7.0 quite extensively in one of my projects. I have now tested IPP 8.2 on a Haswell CPU (Xeon E5-2650 v3 in an HP Z640 workstation) with this project because I expected it to be significantly faster (see below). In fact, the code was about 10% slower with IPP 8.2, which I found quite disturbing.
