cannot get IPP to run faster than old NSP


We are using both the old NSP and IPP 3.0. We cannot get FIR filters to run as fast with IPP as with NSP.

I am typically using the 16 bit data with floating point coefficients.

I am experimenting with using the "direct" form, but simply cannot decipher your documentation. I would like some code examples that would show usage of the direct form for filtering blocks of data.
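For reference, the direct form applied to successive blocks can be sketched in plain C. This is not the IPP API: the `fir_state` struct and `fir_direct_block` function are hypothetical names used only for illustration, and the sketch assumes 16-bit data, float coefficients, and a caller-supplied, zero-initialised delay line that carries the filter history from one block to the next.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for a direct-form FIR; not an IPP or NSP structure. */
typedef struct {
    const float *taps;   /* taps_len coefficients, taps[0] weights the newest sample */
    float *delay;        /* circular delay line, taps_len entries, zeroed by the caller */
    size_t taps_len;
    size_t pos;          /* index of the most recently stored sample */
} fir_state;

/* Filter n samples from src into dst, keeping history in s->delay so that
 * consecutive calls behave like one long filtering pass. */
void fir_direct_block(fir_state *s, const int16_t *src, int16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        /* Advance the circular index and store the new input sample. */
        s->pos = (s->pos + 1) % s->taps_len;
        s->delay[s->pos] = (float)src[i];

        /* Walk backwards through the delay line: newest sample first. */
        float acc = 0.0f;
        size_t idx = s->pos;
        for (size_t k = 0; k < s->taps_len; ++k) {
            acc += s->taps[k] * s->delay[idx];
            idx = (idx == 0) ? s->taps_len - 1 : idx - 1;
        }

        /* Round to nearest and saturate to the 16-bit output range. */
        acc += (acc >= 0.0f) ? 0.5f : -0.5f;
        if (acc > 32767.0f)  acc = 32767.0f;
        if (acc < -32768.0f) acc = -32768.0f;
        dst[i] = (int16_t)acc;
    }
}
```

Calling the function once per block with the same `fir_state` gives the same output as filtering the concatenated blocks in one call, which is the property that matters when data arrives in chunks.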

I am also experimenting with using 32-bit signed coefficients but see no good examples. So I will try multiplying the float coefficients by 2^16 and applying a scale factor of 16 (a right shift by 16) to the final result. I'm assuming that I must give up the most significant bits of the 32-bit word for accumulation, but I can't tell from your documentation what size the accumulators are.
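The scaling described above can be sketched in plain C as follows. This is only an illustration of the arithmetic, not the IPP implementation: `coef_to_q16`, `saturate16`, and `fir_q16_block` are hypothetical helpers, and the 64-bit accumulator is an assumption made here because the accumulator width inside IPP is not stated in this thread. A product of a 32-bit coefficient and a 16-bit sample can need up to 48 bits, so a 64-bit accumulator leaves headroom without giving up coefficient bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a float coefficient to a signed 32-bit value scaled by 2^16,
 * rounding to nearest (hypothetical helper). */
static int32_t coef_to_q16(float c)
{
    float scaled = c * 65536.0f;
    return (int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Clamp a wide value to the 16-bit output range. */
static int16_t saturate16(int64_t v)
{
    if (v > 32767)  return 32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* Direct-form FIR over one block of 16-bit samples with 2^16-scaled taps.
 * Samples before x[0] are treated as zero. */
void fir_q16_block(const int16_t *x, int16_t *y, size_t n,
                   const int32_t *h, size_t taps)
{
    for (size_t i = 0; i < n; ++i) {
        int64_t acc = 0;  /* assumed 64-bit accumulator */
        for (size_t k = 0; k < taps && k <= i; ++k) {
            acc += (int64_t)h[k] * x[i - k];
        }
        /* Scale factor of 16: add half an LSB, then shift right by 16.
         * Assumes the compiler uses an arithmetic shift for negative values. */
        acc = (acc + 32768) >> 16;
        y[i] = saturate16(acc);
    }
}
```

With coefficients 0.5 and 0.25 converted by `coef_to_q16`, the input block {100, 200, -300, 400} yields {50, 125, -100, 125}, matching the floating-point result for this case.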

If someone can help me out, I'd appreciate it.



As the first point, do you know that IPP v4.0 is available now? Why do you use the old version?


I spent three days of work about a year ago converting all of my code from NSP to IPP because a coworker told me that IPP is faster. When I completed the effort, I found that it was about 20% slower. Experts pored over my code and could not determine where I had made a mistake. Currently, my code supports the NSP and IPP versions dynamically, but the switch is always set to NSP.

I would like to make some experiments, as I outlined, to see if I can improve the situation. So I ask you:

  • would you rush to install the latest code, if the last time you tried the "latest" code, it ran slower?
  • does the latest version come with upgraded documentation that would explain better how to use the features?

Personally, when I first used the NSP, I found the documentation to be excellent. I find the IPP documentation average and lacking in many areas.

But I digress - do you have any advice or answers to my original questions?

Thanks in advance

We continuously improve the functionality and performance of the IPP libraries, so I hope the new version is always better than the old one. At least we always work to make it happen. I will look at what examples for FIR filters we have and get back to you.

By the way, what processor do you use to run your code?


Thanks and I understand your point.

The way I figure is that if I can run these experiments, they will work in the new version as well, and I can properly benchmark the performance increases.

We use a variety of Intel processors in our products. Previously, we used an 866 MHz Pentium III (where I could get about 1 GFLOPS of performance with the NSP).

We now use a 2.53 GHz Pentium 4.

I have compared NSP and IPP on both processors.

Do you have any idea how much faster fixed-point filters (such as 32-bit coefficients with 16-bit data) would run compared to floating point on this processor? Also, how much faster might the direct form run?


Did you look at the performance data we provide with IPP? You can find it in your IPP installation tree; the path should be something like this: IPP\tools\perfsys\data

