Cannot reduce OpenMP threads

I can't get good performance from ippsFir_32f() past the point where it starts using FFTs internally (CORRECTION: that is, past the point where FFTs start being processed in parallel, at order 13). I get about 80% wait time, and all of it is caused by _kmp_launch_worker threads.

I've tried

- ippsSetNumThreads(1)
- kmp_set_blocksize(200) via dll import 

Yet I still see multiple kmp threads in VTune, and overall CPU usage is about 75% across 4 cores. What could I be doing wrong here?


Could you attach a simple file that shows how the function is used, along with some profiling results?
For the high CPU usage, can you add the following APIs to reduce the OpenMP block time?
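
The specific calls this reply refers to are not preserved in the thread. As a rough sketch of one common way to reduce OpenMP block time, and not necessarily what was suggested here, the environment variables OMP_NUM_THREADS and KMP_BLOCKTIME (the Intel OpenMP runtime's spin-wait time in milliseconds, 200 by default) can be set before the runtime initializes. The helper name configure_openmp_env is hypothetical, and the sketch assumes a POSIX system with setenv(); on Windows, _putenv_s() would be the closest equivalent.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict C standards */
#include <stdlib.h>

/*
 * Hypothetical helper (not part of IPP or the OpenMP runtime).
 * Sets the thread count and the worker spin-wait time through the
 * environment. It is assumed to run before the OpenMP runtime is
 * initialized, i.e. before the first threaded IPP call, because the
 * runtime typically reads these variables only once at startup.
 * Returns 0 on success, -1 on failure.
 */
static int configure_openmp_env(const char *num_threads,
                                const char *blocktime_ms)
{
    if (num_threads == NULL || blocktime_ms == NULL)
        return -1;

    /* Standard OpenMP variable: upper bound on threads per parallel region. */
    if (setenv("OMP_NUM_THREADS", num_threads, 1) != 0)
        return -1;

    /* Intel-specific variable: how long idle workers spin before sleeping. */
    if (setenv("KMP_BLOCKTIME", blocktime_ms, 1) != 0)
        return -1;

    return 0;
}
```

Calling configure_openmp_env("1", "0") at the very start of the program, before anything that loads the threaded IPP library, would request a single thread and no spin-waiting. Whether this removes the wait time seen in VTune depends on the IPP build in use, so it is worth re-profiling afterwards.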

