How to control the work division in Intel MKL on Intel Xeon Phi

  In Intel MKL, the Level 3 BLAS functions that support automatic offload (?GEMM, ?TRMM, ?TRSM) can divide their computation between the host CPU and the Intel Xeon Phi coprocessors. Users can override the default work division chosen by the Intel MKL runtime, either by setting an environment variable or by calling a support function.

 The table below gives a few examples showing how to set and manage the division of work between the host and coprocessor(s).

Examples                                                                Notes
MKL_MIC_Set_Workdivision(MKL_TARGET_MIC, -1, 0.5)                       Offload 50% of the computation to all cards. The runtime system decides which cards to use.
MKL_MIC_Set_Workdivision(MKL_TARGET_MIC, 0, 0.5)                        Offload 50% of the computation only to the 1st card.
MKL_MIC_Set_Workdivision(MKL_TARGET_MIC, 1, MKL_MIC_AUTO_WORKDIVISION)  Let the runtime decide how much work to offload to the 2nd card.
MKL_MIC_Set_Workdivision(MKL_TARGET_HOST, 0, 0.5)                       Keep 50% of the computation on the host and offload the rest to all cards. (In this case the second argument is ignored.)
MKL_MIC_Get_Workdivision(MKL_TARGET_MIC, 0, &wd)                        Find how much work was specified for the 1st card.
MKL_MIC_Get_Device_Count()                                              Find how many cards are available on the system.

Work division can also be controlled using environment variables. See the examples in the table below.

Note that support functions always take precedence over environment variables.

 

Examples                                       Notes
MKL_MIC_WORKDIVISION=0.5                       Offload 50% of the computation to all cards. The runtime system decides which cards to use.
MKL_MIC0_WORKDIVISION=0.5                      Offload 50% of the computation only to the 1st card.
MKL_MIC1_WORKDIVISION=MIC_AUTO_WORKDIVISION    Let the runtime decide how much work to offload to the 2nd card.
MKL_HOST_WORKDIVISION=0.5                      Keep 50% of the computation on the host and offload the rest to all cards.
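
The variables above are normally exported in the shell before the program starts. As a rough sketch, they could also be set from inside a C program with the POSIX setenv() call. The helpers set_workdivision_env and get_workdivision_env below are hypothetical and are not part of Intel MKL. The 0.0 to 1.0 range check is an assumption. Whether Intel MKL picks up a variable set this way depends on when the library reads its environment, so setting it before the first MKL call is also an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* makes setenv() visible in strict C modes */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not an MKL function): store a work-division
 * fraction in the named environment variable, for example
 * "MKL_MIC_WORKDIVISION" or "MKL_HOST_WORKDIVISION".
 * Assumes a valid fraction lies in [0.0, 1.0].
 * Returns 0 on success and -1 on failure. */
int set_workdivision_env(const char *name, double fraction)
{
    char value[32];

    if (fraction < 0.0 || fraction > 1.0)
        return -1;
    snprintf(value, sizeof value, "%.2f", fraction);
    return setenv(name, value, 1);  /* 1 = overwrite an existing value */
}

/* Hypothetical helper: read the fraction back from the environment.
 * Returns -1.0 if the variable is unset or does not start with a number
 * (for example when it holds MIC_AUTO_WORKDIVISION). */
double get_workdivision_env(const char *name)
{
    const char *text = getenv(name);
    char *end;
    double value;

    if (text == NULL)
        return -1.0;
    value = strtod(text, &end);
    if (end == text)
        return -1.0;
    return value;
}
```

For instance, set_workdivision_env("MKL_MIC_WORKDIVISION", 0.5) stores the string 0.50 in the process environment, which mirrors the first row of the table.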

Note: These work division controls work differently for LAPACK functions. For LAPACK functions, any non-zero value of work division is interpreted as 100%.
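
The LAPACK rule in the note can be pictured with a very small sketch. The function lapack_effective_workdivision is a hypothetical illustration of the stated behavior, not an Intel MKL function, and it assumes the requested value is a plain fraction rather than the automatic setting.

```c
/* Hypothetical illustration (not an MKL function): for LAPACK functions,
 * any non-zero work-division value is interpreted as 100%. */
double lapack_effective_workdivision(double requested)
{
    return (requested != 0.0) ? 1.0 : 0.0;
}
```

Under this reading, a request of 0.5 and a request of 0.01 both behave like 1.0, while 0.0 stays 0.0.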

 

For other articles related to Intel MKL on Intel Xeon Phi, see Intel® Math Kernel Library on the Intel® Xeon Phi™ Coprocessor.

 

 
