I am trying to use TBB and IPP together to improve performance.

I use TBB to do filtering with the IPP function "ippsFIR_32fc"; each thread works on a portion of the data. But the results are quite strange: I can see a lot of glitches (very large values) in the output data.

The code is as follows:

parallel_for(tbb::blocked_range<Int>(0, inV.Length, inV.Length/1.5), tbb_parallel_fir_task((Ipp32fc *)inV.Data, filterCoefCP, filterVP->Length, (Ipp32fc *)outVP->Data, m_stateP));

void operator() (const blocked_range<Int>& r) const
{
    Int begin = r.begin();
    Int end = r.end();
    Int nIters = end - begin;

    ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);
}
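To make the intent concrete, here is a minimal IPP-free sketch of the result I expect from filtering in chunks. It is only an illustration under my own assumptions: a plain direct-form loop stands in for "ippsFIR_32fc", a sequential loop over chunks stands in for parallel_for, and the names fir_chunk and fir_chunked are hypothetical. Each chunk reads its history directly from the input buffer, so the calls share no mutable state and the chunked output should match a single pass over the whole buffer.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Direct-form FIR over [begin, end): y[n] = sum_k h[k] * x[n - k].
// Samples before index 0 are treated as zero. The function only reads
// `in` and writes out[begin..end), so no state is shared between calls.
void fir_chunk(const std::vector<cf>& in, const std::vector<cf>& taps,
               std::vector<cf>& out, std::size_t begin, std::size_t end)
{
    for (std::size_t n = begin; n < end; ++n) {
        cf acc(0.0f, 0.0f);
        for (std::size_t k = 0; k < taps.size() && k <= n; ++k)
            acc += taps[k] * in[n - k];
        out[n] = acc;
    }
}

// Filters the input chunk by chunk. Each fir_chunk call is independent,
// which is the property a parallel_for body would need.
// `grain` (hypothetical parameter) is the chunk length and must be > 0.
std::vector<cf> fir_chunked(const std::vector<cf>& in,
                            const std::vector<cf>& taps, std::size_t grain)
{
    std::vector<cf> out(in.size());
    for (std::size_t begin = 0; begin < in.size(); begin += grain) {
        const std::size_t end = std::min(begin + grain, in.size());
        fir_chunk(in, taps, out, begin, end);
    }
    return out;
}
```

In my real code every chunk goes through the same m_stateP instead, which is the main difference from this sketch.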

If I replace the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multi-threaded copy works fine.

Another question: among the floating-point functions, I did not see a FIR variant that takes complex input data and real filter coefficients. I only see complex input data with complex filter coefficients, or real input data with real filter coefficients.
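The only workaround I can think of, assuming no such mixed-type function exists, is to promote the real coefficients to complex ones with a zero imaginary part and use the complex/complex variant. A minimal sketch of that conversion follows; promote_taps is a hypothetical helper name, and std::complex<float> stands in for Ipp32fc.

```cpp
#include <complex>
#include <vector>

using cf = std::complex<float>;

// Converts real FIR coefficients to complex coefficients whose
// imaginary part is zero, so they can be passed to a complex FIR.
std::vector<cf> promote_taps(const std::vector<float>& real_taps)
{
    std::vector<cf> out;
    out.reserve(real_taps.size());
    for (float t : real_taps)
        out.emplace_back(t, 0.0f);
    return out;
}
```

This gives the same result mathematically, but it spends multiplications on the zero imaginary parts, which is why I am asking whether a dedicated function exists.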

Note: I have already used the function 'ippSetNumThreads(1)' to set the number of IPP internal OpenMP threads to 1.

Could you please help me?