We are using both the old NSP library and IPP 3.0, and we cannot get FIR filters to run as fast with IPP as with NSP.
I typically use 16-bit data with floating-point coefficients.
I am experimenting with the "direct" form, but I simply cannot decipher your documentation. I would like some code examples that show how to use the direct form to filter blocks of data.
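To make the request concrete, here is a minimal plain-C sketch of what I mean by direct-form filtering of blocks: 16-bit samples, float coefficients, and a delay line that persists between calls so consecutive blocks join without a glitch. It does not call IPP or NSP. The names `fir_direct_state`, `fir_direct_block`, and `saturate16` are hypothetical and exist only for illustration, and the round-and-saturate step is my assumption about how the output should be converted back to 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter state: a circular delay line of past inputs. */
typedef struct {
    const float *taps;   /* taps[0] multiplies the newest sample */
    int16_t *delay;      /* caller-provided, num_taps entries, zeroed at start */
    size_t num_taps;
    size_t index;        /* slot that receives the next input sample */
} fir_direct_state;

/* Round to nearest and clamp to the int16_t range (assumed output rule). */
static int16_t saturate16(float v) {
    if (v > 32767.0f) return INT16_MAX;
    if (v < -32768.0f) return INT16_MIN;
    return (int16_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Filter one block: y[n] = sum over t of taps[t] * x[n - t].
 * The delay line carries the history over to the next call. */
void fir_direct_block(fir_direct_state *s, const int16_t *src,
                      int16_t *dst, size_t len) {
    for (size_t n = 0; n < len; ++n) {
        s->delay[s->index] = src[n];          /* store newest sample */
        float acc = 0.0f;
        size_t k = s->index;                  /* walk from newest to oldest */
        for (size_t t = 0; t < s->num_taps; ++t) {
            acc += s->taps[t] * (float)s->delay[k];
            k = (k == 0) ? s->num_taps - 1 : k - 1;
        }
        dst[n] = saturate16(acc);
        s->index = (s->index + 1 == s->num_taps) ? 0 : s->index + 1;
    }
}
```

The caller zeroes the delay line once, then calls `fir_direct_block` for each incoming block with the same state. What I am looking for is the equivalent sequence of IPP calls.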
I am also experimenting with 32-bit signed integer coefficients, but I see no good examples. My plan is to multiply the float coefficients by 2^16 and apply a scale factor of 16 (shift the final result right by 16). I am assuming that I must give up the most significant bits of the 32-bit word to leave headroom for accumulation, but I cannot tell from your documentation how wide the accumulators are.
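The arithmetic I have in mind looks roughly like the plain-C sketch below. It is not IPP code, and the 64-bit accumulator is purely my assumption, since the accumulator width is exactly what I cannot find documented. The names `quantize_tap` and `fir_q16_sample` are hypothetical. For scale: with 16-bit data and coefficients of magnitude below 1.0 scaled by 2^16, each product can already approach 2^31, so summing more than a couple of taps in a 32-bit accumulator could overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define Q_SHIFT 16  /* coefficients are scaled by 2^16 */

/* Convert a float coefficient to a Q16 integer, rounding to nearest. */
int32_t quantize_tap(float h) {
    double scaled = (double)h * 65536.0;
    return (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* One output sample; history[t] holds x[n - t].
 * Assumption: a 64-bit accumulator, wide enough for any 16x32-bit sum. */
int16_t fir_q16_sample(const int32_t *taps, const int16_t *history,
                       size_t num_taps) {
    int64_t acc = 0;
    for (size_t t = 0; t < num_taps; ++t) {
        acc += (int64_t)taps[t] * (int64_t)history[t];
    }
    /* Add half an LSB, then shift right by 16 to undo the coefficient scale.
     * Right-shifting a negative value is implementation-defined in C;
     * common compilers perform an arithmetic shift. */
    acc = (acc + ((int64_t)1 << (Q_SHIFT - 1))) >> Q_SHIFT;
    if (acc > INT16_MAX) return INT16_MAX;   /* saturate to 16 bits */
    if (acc < INT16_MIN) return INT16_MIN;
    return (int16_t)acc;
}
```

If the IPP accumulator is narrower than this, I would need to use fewer fractional bits in the coefficients, which is why the accumulator width matters for choosing the scale factor.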
If someone can help me out, I'd appreciate it.
cannot get IPP to run faster than old NSP