Cannot reduce OpenMP threads

I can't get good performance with ippsFir_32f() past the point where it starts using FFTs internally (CORRECTION: that is, past the point where FFTs start getting processed in parallel, at order 13). About 80% of the time is wait time, and all of it is caused by _kmp_launch_worker threads.

I've tried

- ippsSetNumThreads(1)
- kmp_set_blocksize(200) via dll import 

Yet I still see multiple kmp threads in VTune, and overall CPU usage is about 75% across 4 cores. What could I be doing wrong here?


Could you attach a simple file that shows how the function is used, along with some profiling results?
For the high CPU usage, can you add the following APIs to reduce the OpenMP block time?
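
As a rough sketch of one way to try this without touching the runtime API directly, the Intel OpenMP runtime reads the `KMP_BLOCKTIME` environment variable (milliseconds a worker spins before sleeping) and the standard `OMP_NUM_THREADS` variable. The helper name `configure_openmp_env` is hypothetical, and the sketch assumes the variables are set before the OpenMP runtime initializes in the process; if the runtime is already up, setting them in the shell before launching the program is the safer route. Note that 200 ms is, as far as I know, already the default block time, so a lower value such as 0 is what would change the behaviour.

```c
#include <stdlib.h>

/* Portable wrapper: setenv() on POSIX, _putenv_s() on Windows. */
static int set_env_var(const char *name, const char *value)
{
#ifdef _WIN32
    return _putenv_s(name, value) == 0 ? 0 : -1;
#else
    return setenv(name, value, 1) == 0 ? 0 : -1;
#endif
}

/*
 * Hypothetical helper (not part of IPP or the OpenMP runtime).
 * Sets the thread count and block time through environment variables.
 * Assumption: this runs before the OpenMP runtime reads its settings.
 * Returns 0 on success, -1 on failure.
 */
static int configure_openmp_env(const char *num_threads,
                                const char *blocktime_ms)
{
    if (set_env_var("OMP_NUM_THREADS", num_threads) != 0)
        return -1;
    if (set_env_var("KMP_BLOCKTIME", blocktime_ms) != 0)
        return -1;
    return 0;
}
```

Calling `configure_openmp_env("1", "0")` as the first statement of the program, before any IPP function, asks for a single thread and no spin-waiting. Whether the IPP build in use honours these variables depends on how it links the OpenMP runtime, so it is worth confirming the effect in VTune.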
