Before the launch of the Intel® Xeon Phi™ coprocessor, Intel collected questions from developers involved in pilot testing. This document answers some of the most common of those questions.
The Intel® Compiler reference guides can be found at:
The Intel® MPI reference guide and the addendum for the Intel® Many Integrated Core (Intel® MIC) architecture can be found at:
The Intel® Math Kernel Library (Intel® MKL) reference guide can be found at:
Q) How do I disable automatic offloads for a specific Intel® MKL call, or on a specific coprocessor?
You can disable automatic offloads by calling mkl_mic_disable().
Alternatively, you can use mkl_mic_set_workdivision() to assign the entire computation to the host, which effectively disables offloads. You can also set the work division to zero on every coprocessor to disable offloads completely, or to zero on a single coprocessor to disable offloads to that device only.
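The calls above can be sketched in C as follows. This is a minimal sketch assuming the MKL 11.x automatic-offload support functions declared in mkl.h (MKL_TARGET_HOST and MKL_TARGET_MIC are the target identifiers defined by that API); it requires an Intel MKL installation to compile and link.

```c
/* Sketch of the automatic-offload (AO) controls described above;
 * assumes the MKL 11.x AO API from mkl.h. */
#include <mkl.h>

int main(void)
{
    /* Option 1: turn automatic offload off entirely. */
    mkl_mic_disable();

    /* Option 2: keep AO enabled, but assign 100% of the work to the
     * host, which effectively disables offloads. */
    mkl_mic_set_workdivision(MKL_TARGET_HOST, 0, 1.0);

    /* Option 3: give coprocessor 0 a work share of zero, so offloads
     * to that specific device are disabled. */
    mkl_mic_set_workdivision(MKL_TARGET_MIC, 0, 0.0);

    return 0;
}
```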
Q) Which Intel MKL Functional domains are supported on Intel Many Integrated Core (Intel MIC) Architecture?
The Intel MKL 11.0 Update 2 supports the following functional domains on Intel MIC Architecture:
- BLAS level 3, and much of levels 1 and 2
- Sparse BLAS: CSRMV, CSRMM (Native only)
- Some important LAPACK routines: (LU, QR, Cholesky)
- Fast Fourier Transforms
- Vector Math Library (Native Only)
- Random number generators in the Vector Statistical Libraries
- Poisson Solver
- Iterative Sparse Solvers
- Trust Region Solvers
- Everything else (runs on the host only)
To find more information, please visit this page.
Q) How can I use Intel MKL and third-party applications together?
Articles describing how to use Intel MKL with third-party libraries and applications such as NumPy*, SciPy*, MATLAB*, C#, Python*, and NAG* can be found here.
Q) How can I control the threading in Intel MKL routines?
You can control the parallelism within Intel MKL routines by using the MKL threading support functions and certain environment variables. You can read more at:
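As an illustration, the threading support functions can be used as sketched below. This is a minimal sketch assuming the threading-control functions declared in mkl.h (mkl_set_num_threads, mkl_domain_set_num_threads, and the MKL_DOMAIN_BLAS domain identifier); it requires an Intel MKL installation to compile and link.

```c
/* Sketch of MKL threading control; assumes the functions
 * declared in mkl.h. */
#include <mkl.h>

int main(void)
{
    /* Request 4 threads for all Intel MKL routines... */
    mkl_set_num_threads(4);

    /* ...or control one functional domain independently, e.g. allow
     * only 2 threads for BLAS routines. */
    mkl_domain_set_num_threads(2, MKL_DOMAIN_BLAS);

    return 0;
}
```

The MKL_NUM_THREADS and MKL_DOMAIN_NUM_THREADS environment variables provide the same control without code changes.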
Q) How do I run the Intel® MPI Library on the Intel Xeon Phi coprocessor?
You can find more about running Intel MPI on the Intel Xeon Phi coprocessor at
Q) Which Intel MPI-files do I need to transfer to the coprocessor for my MPI application?
Several binaries and libraries need to be transferred to the Intel Xeon Phi coprocessor to execute MPI applications on the coprocessor. You can find more information at:
Q) How can I pin Intel MPI processes on the Intel Xeon Phi coprocessor?
Information on pinning Intel MPI processes on the Intel Xeon Phi coprocessor can be found at:
Q) How do I pin processes and associated threads under Intel MPI in a hybrid MPI-OpenMP model in native mode?
You can pin processes and their associated threads using the I_MPI_PIN_DOMAIN and KMP_AFFINITY environment variables.
- For 4 or fewer OpenMP threads per MPI process, set the variables as follows:
For example, to pin 4 threads per process use:
- For more than 4 OpenMP threads per MPI process, set the variables as follows:
For example, to pin 8 threads per process use:
I_MPI_PIN_DOMAIN=omp; KMP_AFFINITY=compact; OMP_NUM_THREADS=8
In this case, remember to set OMP_NUM_THREADS to a multiple of four to avoid splitting cores (each Intel Xeon Phi coprocessor core supports four hardware threads).
Also, setting I_MPI_DEBUG=5 reveals the MPI process affinity map.
You can read more about OpenMP-MPI interoperability at this page.
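A launch of a hybrid MPI-OpenMP binary using the settings above might look like the following sketch. The application name ./myApp and the rank count are placeholders, not from the original.

```shell
# Hypothetical native-mode launch: 4 MPI ranks, 8 OpenMP threads per
# rank, each rank pinned to its own OpenMP domain with compact
# thread placement; I_MPI_DEBUG=5 prints the resulting affinity map.
mpirun -env I_MPI_PIN_DOMAIN omp \
       -env KMP_AFFINITY compact \
       -env OMP_NUM_THREADS 8 \
       -env I_MPI_DEBUG 5 \
       -n 4 ./myApp
```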
Q) How do I pin threads within an offload created by an Intel MPI process?
This is the case where you have multiple MPI processes on the host, each of which offloads some work to the Intel Xeon Phi coprocessor. To keep the multiple MPI processes from running their offloaded code on the same hardware threads, the OpenMP KMP_AFFINITY environment variable must be set carefully. To pin threads and prevent interference, use a long command line with KMP_AFFINITY proclist settings as shown below:
mpirun -env MIC_OMP_NUM_THREADS 10 -env MIC_KMP_AFFINITY granularity=fine,proclist=[1-8],explicit -n 1 ./myApp \
     : -env MIC_OMP_NUM_THREADS 10 -env MIC_KMP_AFFINITY granularity=fine,proclist=[9-16],explicit -n 1 ./myApp \
     : -env MIC_OMP_NUM_THREADS 10 -env MIC_KMP_AFFINITY granularity=fine,proclist=[17-24],explicit -n 1 ./myApp \
     : -env MIC_OMP_NUM_THREADS 10 -env MIC_KMP_AFFINITY granularity=fine,proclist=[25-32],explicit -n 1 ./myApp
In the example above, there is one argument set per MPI process; the ":" character separates the argument sets.
For more information, please visit this page.
Q) Does the Intel Xeon Phi coprocessor have support for Berkeley Lab Checkpoint/Restart (BLCR)?
Currently, the Intel Xeon Phi coprocessor does not support BLCR.