MKL thread priorities

Hello All,

In version 11 of MKL for Linux, is it possible to either:
1) specify the pthread attributes to be used when creating any MKL threads,
or
2) have the MKL threads call back into a user-specified function, so that the user can override the default thread attributes (e.g. the thread priority)?

If not, will the next version of the Linux MKL support such features?

Thanks in advance,

Bob

Bob,

MKL's threading layer is based on OpenMP, so you cannot control MKL threads using pthread controls. You can, of course, use pthreads to spawn user threads in your application and call into MKL from each user thread. There are some particulars you need to be careful about if you're considering this usage model. Please see the discussion here: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/...
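
A minimal sketch of the user-thread model described above, assuming plain POSIX threads on Linux. The names `worker` and `spawn_worker` are hypothetical, and the body of `worker` is only a placeholder for the real MKL call (which would typically run with MKL's own threading disabled, for example via `mkl_set_num_threads(1)` or the sequential layer; that choice is an assumption, not something stated in this thread). Realtime policies such as SCHED_FIFO normally need CAP_SYS_NICE or a suitable RLIMIT_RTPRIO, so the sketch falls back to default attributes when the request is refused.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical worker: a real program would call into MKL here. */
static void *worker(void *arg)
{
    double *result = arg;
    *result = 42.0;   /* placeholder for the actual computation */
    return NULL;
}

/* Create one user thread with an explicit scheduling policy and priority.
 * Returns 0 on success, or the error code from pthread_create().
 * If the requested scheduling is refused, retries with default attributes. */
int spawn_worker(pthread_t *thread, int policy, int priority, double *result)
{
    pthread_attr_t attr;
    struct sched_param param;
    int rc;

    memset(&param, 0, sizeof param);
    param.sched_priority = priority;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Without PTHREAD_EXPLICIT_SCHED the policy and priority set below are
     * ignored and the new thread inherits the creator's scheduling. */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, policy);
    pthread_attr_setschedparam(&attr, &param);

    rc = pthread_create(thread, &attr, worker, result);
    pthread_attr_destroy(&attr);

    if (rc != 0) {
        fprintf(stderr, "requested scheduling refused (%s); using defaults\n",
                strerror(rc));
        rc = pthread_create(thread, NULL, worker, result);
    }
    return rc;
}
```

A caller would use something like `spawn_worker(&t, SCHED_FIFO, 10, &r)` followed by `pthread_join(t, NULL)`. Because the application creates these threads itself, their attributes are fully under its control, which is not the case for the OpenMP threads that MKL creates internally.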

Zhang

>>...MKL's threading layer is based on OpenMP. You cannot control MKL threads using pthread controls....

I agree with that, and MKL's OpenMP threads should have Normal priority. You could "walk" through the created threads and change their priorities to Above Normal or High (in my tests the performance improvement was ~1% or less). A change to Realtime is not recommended: I did not see any performance improvement (on the contrary, performance degraded for very large data sets). Note that my tests were done on a Windows platform.

Take into account that Realtime threads will preempt almost all other threads, including virtual memory management threads.

>>...If not, will the next version of the Linux MKL support such features?

In my opinion it does not make sense for the Intel MKL team to implement such functionality. Try a workaround instead: "walk" the threads and change their priorities if you need to.
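
On Linux, one way to sketch this "walk" is to enumerate the thread IDs under `/proc/self/task` and call `sched_setscheduler()` on each one. This is an assumption about how the workaround could look on Linux (the tests mentioned above were on Windows), and the function name `set_priority_all_threads` is hypothetical. It only reaches threads that already exist, so it would have to run after MKL has created its OpenMP threads, for example after a first MKL call. Realtime policies normally need CAP_SYS_NICE or a suitable RLIMIT_RTPRIO.

```c
#include <dirent.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Apply policy/priority to every existing thread of the calling process.
 * Stores the number of threads found in *seen and returns how many were
 * changed successfully, or -1 if /proc/self/task could not be opened. */
int set_priority_all_threads(int policy, int priority, int *seen)
{
    DIR *dir = opendir("/proc/self/task");
    struct dirent *entry;
    struct sched_param param;
    int changed = 0;

    *seen = 0;
    if (dir == NULL)
        return -1;

    memset(&param, 0, sizeof param);
    param.sched_priority = priority;

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(entry->d_name, &end, 10);

        if (end == entry->d_name || *end != '\0')
            continue;   /* skip "." and ".." */
        ++*seen;

        /* On Linux, passing a thread ID changes only that thread. */
        if (sched_setscheduler((pid_t)tid, policy, &param) == 0)
            ++changed;
        else
            fprintf(stderr, "tid %ld: %s\n", tid, strerror(errno));
    }
    closedir(dir);
    return changed;
}
```

For example, `set_priority_all_threads(SCHED_FIFO, 50, &seen)` would request the same realtime priority for every thread, including the application's own threads, so a real program may want to filter the list.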

Also, you have not explained what problem you are having at the moment.

Hello all,

Our Linux application requires a matrix decomposition to be performed in real-time, regardless of the background load on any of the CPU cores.
I was hoping that we could use the MKL to distribute this decomposition across multiple cores (all running at the same elevated priority).

Thanks

Bob

>>I was hoping that we could use the MKL to distribute this decomposition across multiple cores (all running
>>at the same elevated priority).

It is not clear whether you have a performance problem with OpenMP threads at Normal priority. As I mentioned before, you can raise the priorities, but this is tricky: you need a separate monitoring thread that changes the thread priorities as soon as all the OpenMP threads have been created by the MKL functions. Even once the priorities are changed, don't expect a significant performance boost for your calculations.

Quote:

Bob J. wrote:

Our Linux application requires a matrix decomposition to be performed in real-time, regardless of the background load on any of the CPU cores. I was hoping that we could use the MKL to distribute this decomposition across multiple cores (all running at the same elevated priority).

Hello Bob,

A real realtime constraint (this is not a typo ;-) is usually not satisfied by raising the thread priorities. Some people even argue that a patched/preemptive Linux kernel is a sufficient solution. There are OSes and Linux flavors that are better suited to real-time requirements. Sometimes Asymmetric Multiprocessing (AMP) along with a hypervisor (such as the solution from WindRiver) can be used to isolate code with such requirements while keeping "normal code" in a usual environment. You may also have a look at Intel IPP, which provides non-PIC variants of the libraries that are suitable to run in kernel mode. However, this requires you to rely on "application-level threading" (well, the kernel mode is not exactly an "application"), i.e., to employ your own threading.

If you are just asking for code that is "effectively" running in realtime (perhaps on a certified system), you have quite a few more choices. What about isolating your matrix decomposition into a separate program? This entire program can then execute with realtime priority (perhaps pinned to a certain number of reserved cores). You can then communicate with this "service program" from your normal application (running with regular priority on whatever core[s]). There are also many options for implementing the communication with such a service; for example, you can use a shared arena.

 Hans
