Using Intel TBB with IPP

caosun

I am trying to use TBB and IPP together to improve performance.
I use TBB to do filtering with the IPP function "ippsFIR_32fc"; each thread works on a portion of the data. But the results are quite strange: I see a lot of glitches (very large values) in the output data.

The code is as follows:

parallel_for(tbb::blocked_range<Int>(0, inV.Length, inV.Length/1.5),
             tbb_parallel_fir_task((Ipp32fc *)inV.Data, filterCoefCP, filterVP->Length,
                                   (Ipp32fc *)outVP->Data, m_stateP));

void operator() (const blocked_range<Int>& r) const
{
    Int begin = r.begin();
    Int end = r.end();
    Int nIters = end - begin;
    ippsFIR_32fc(m_inP + begin, m_outP + begin, nIters, m_stateP);
}

If I replace the IPP function "ippsFIR_32fc" with "ippsCopy_32f", the multithreaded copy works fine.

Note: I have already used the function 'ippSetNumThreads(1)' to set the number of IPP internal OpenMP threads to 1.
Could you please help me?

Anton Malakhov (Intel)

You should use serial IPP functions with TBB. I don't know the details, however.

A side note: what a strange grainsize, "inV.Length/1.5"! Do you want to parallelize over only two threads?

Kirill Rogozhin (Intel)

caosun, can you also post the question to the Intel IPP forum?
http://software.intel.com/en-us/forums/intel-integrated-performance-primitives/

caosun

You are right, I just want to see how it works on two threads.

Raf Schietekat

"You are right, I just want to see how it works on two thread."
Just for now, or do you have a valid motivation? In general it is not advised to (ab)use grainsize to specify the number of chunks (executed subranges), be that O(1) or O(available parallelism).
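To make the grainsize remark concrete, here is a minimal sketch in plain C++ (no TBB calls) of the splitting rule as I understand it: a blocked_range is treated as divisible while its size exceeds the grainsize, and it is split in the middle. The helper name count_chunks is hypothetical, and the result is only the upper bound that exhaustive splitting would give; a partitioner such as the default auto_partitioner may well stop splitting earlier.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of TBB): counts the leaf subranges produced
// when a range [begin, end) is halved for as long as its size exceeds the
// grainsize. This mimics exhaustive splitting only; it is an assumption about
// how the range is divided, not a call into the library.
std::size_t count_chunks(std::size_t begin, std::size_t end, std::size_t grainsize) {
    assert(grainsize > 0);                      // a zero grainsize would never terminate
    const std::size_t size = end - begin;
    if (size <= grainsize) return 1;            // not divisible: runs as a single chunk
    const std::size_t mid = begin + size / 2;   // split in the middle
    return count_chunks(begin, mid, grainsize) + count_chunks(mid, end, grainsize);
}
```

With a length of 1000, a grainsize of 1000/1.5 (truncated to 666) allows exactly one split, so at most two chunks exist no matter how many worker threads are available. That is why the chunk count ends up hard-wired when grainsize is used this way.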

Sergey Kostrov

Quoting Raf Schietekat "You are right, I just want to see how it works on two thread."
Just for now, or do you have a valid motivation? In general it is not advised to (ab)use grainsize to specify the number of chunks (executed subranges), be that O(1) or O(available parallelism).

Unfortunately, user 'caosun' didn't follow up. The problem was related to reusing a single FIR state variable that is
passed to the ippsFIR* function by many TBB threads...

There is a thread in IPP forum and it looks like 'caosun' resolved the problem.
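The diagnosis above can be illustrated with a minimal sketch in plain C++. It deliberately uses no IPP or TBB calls: FirState, make_state, fir_run, filter_serial and filter_chunked are all hypothetical names, and FirState is only a stand-in for a filter state that holds a delay line. The idea, under those assumptions, is that every chunk gets its own state, preloaded with the input samples that precede the chunk, so the chunks no longer share mutable data and can be run in any order or in parallel.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Hypothetical stand-in for a FIR state: the taps plus a delay line holding
// the most recent (taps.size() - 1) input samples, newest last.
struct FirState {
    std::vector<cf> taps;
    std::vector<cf> delay;
};

// Build a private state whose delay line is preloaded with the samples that
// precede position `begin` (zeros before the start of the signal).
FirState make_state(const std::vector<cf>& taps, const std::vector<cf>& in, std::size_t begin) {
    FirState s{taps, std::vector<cf>(taps.size() - 1, cf(0.0f, 0.0f))};
    const std::size_t hist = s.delay.size();
    for (std::size_t k = 0; k < hist; ++k)
        if (begin + k >= hist) s.delay[k] = in[begin + k - hist];
    return s;
}

// Filter n samples and update the delay line. Because the state is mutated,
// sharing one state between concurrent callers would corrupt the output.
void fir_run(const cf* in, cf* out, std::size_t n, FirState& s) {
    const std::size_t hist = s.delay.size();
    for (std::size_t i = 0; i < n; ++i) {
        cf acc = s.taps[0] * in[i];
        for (std::size_t j = 1; j <= hist; ++j)
            acc += s.taps[j] * s.delay[hist - j];   // delay[hist - j] is x[i - j]
        out[i] = acc;
        if (hist > 0) {                             // shift in the newest sample
            s.delay.erase(s.delay.begin());
            s.delay.push_back(in[i]);
        }
    }
}

// Reference result: one state, one pass over the whole signal.
std::vector<cf> filter_serial(const std::vector<cf>& in, const std::vector<cf>& taps) {
    std::vector<cf> out(in.size());
    FirState s = make_state(taps, in, 0);
    fir_run(in.data(), out.data(), in.size(), s);
    return out;
}

// Chunked result: each chunk owns its state. The chunks are visited in
// reverse order here to show that they are independent of one another;
// each loop iteration could equally be the body of a parallel task.
std::vector<cf> filter_chunked(const std::vector<cf>& in, const std::vector<cf>& taps,
                               std::size_t chunks) {
    std::vector<cf> out(in.size());
    for (std::size_t c = chunks; c-- > 0;) {
        const std::size_t begin = in.size() * c / chunks;
        const std::size_t end = in.size() * (c + 1) / chunks;
        FirState local = make_state(taps, in, begin);   // private to this chunk
        fir_run(in.data() + begin, out.data() + begin, end - begin, local);
    }
    return out;
}
```

In this sketch the chunked output matches the serial output exactly, because each output sample is computed from the same taps and the same history. With the real library the equivalent step would presumably be to create one FIR state per task and initialise its delay line from the preceding input samples, but the exact calls for that are not shown in this thread.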

caosun

Thank you all for your information. The threading issue was solved in the IPP forum. Sergey, could you please answer this question: why is there no IPP filter operation for complex input data with real filter coefficients?

Sergey Kostrov

Quoting caosun Thank you all for your information. The thread issue is solved in IPP forum.
[SergeyK] Thank you for the confirmation.

Sergey, could you please answer me the question: Why there is no IPP filter operation: complex input data and real filter coefficients.

Sorry, I don't know. It is possible that the IPP team was busy implementing multithreading support in the IPP library instead of
implementing additional features related to image and digital signal processing, etc.

By the way, do you know that the IPP team is considering a new release of the IPP library without multithreading?

Best regards,
Sergey

caosun

Hi Sergey:

I am interested in the new release of the IPP library.

Are there any new features? Could you please give me more information on that?

Thanks.

Best Regards,

Sun Cao
