I'm profiling a numerical model (parallelized with OpenMP) using the Intel VTune Advanced Hotspots analysis (default settings).
The results show that the KMP_ variables KMP_WAIT_SLEEP, KMP_X86_PAUSE, and KMP_STATIC_YIELDS account for a non-negligible amount of overhead time.
In addition, a subroutine (PREDICTION) shows a relatively high CPI (please see the attached PNG).
My questions are: How should I go about reducing the overhead time of these KMP_ variables? Which matters more, overhead time or CPI, and how can I lower the CPI? Are there any other options I am missing, based on the attached results?
Thanks in advance for your answers,