I am trying to use TBB and IPP together to gain speed performance.
I use TBB todo filtering with IPPfunction "ippsFIR_32fc", each thead works on portion of data. But the results are quite strange. I can see a lot of glitch (very large values)into the output data.
The code is as following:
parallel_for(tbb::blocked_range (0, inV.Length, inV.Length/1.5), tbb_parallel_fir_task((Ipp32fc *)inV.Data, filterCoefCP, filterVP->Length, (Ipp32fc *)outVP->Data, m_stateP));
void operator() (const blocked_range& r) const
Int begin = r.begin();
Int end = r.end();
Int nIters = end - begin;
ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);
If I remove the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multiple thread copy functionality works fine.
Another question is: For float point function, I did not see this type of FIR: complex input data and real filter coefficients. I indeed see complex input data and complex filter coefficients OR real input data and real filter coefficients.
Note: I have already use function 'ippSetNumThreads(1)' to set IPP internal OpenMP threads number to 1.
Could you please help me?