How to Use the Intel® Math Kernel Library ScaLAPACK Library on Intel® Xeon Phi™ Coprocessors

Intel® Math Kernel Library (Intel® MKL) provides highly optimized, extensively threaded math routines for applications that need high performance on the latest Intel® processors, including the Intel® Xeon Phi™ coprocessor; see [1] and [2]. This article introduces how to use the Intel MKL ScaLAPACK library on Intel Xeon Phi coprocessors. A test code is attached.


  1. Intel® MKL 11.0 or higher
    ScaLAPACK routines have been tuned for Intel Xeon Phi coprocessors, which are based on the Intel® Many Integrated Core Architecture (Intel® MIC Architecture), since Intel MKL 11.0.
  2. Intel® C++ or Fortran compilers version 12.1 or higher
    Needed to compile applications for the Intel Xeon Phi coprocessor.
  3. Intel® MPI Library 4.1 or higher [3]
    Intel MKL ScaLAPACK supports only the Intel® MPI Library on Intel Xeon Phi coprocessors.
  4. Test platform (hardware and software)
    Intel MKL supports host processors based on Intel® 64 or compatible architectures and Intel Xeon Phi coprocessors. For example, this article uses:
    • Host machine: Intel® Xeon® CPU E5-2680 @ 2.70GHz, 2 CPUs x 8 cores = 16 threads available, Red Hat Enterprise Linux* Server release 6.2 64 bits, kernel 2.6.32-220.el6.x86_64.
    • Intel Xeon Phi Coprocessor 7110P, 61 cores, 1.09GHz, 8GB GDDR5 memory, with MPSS 3.1.

On the host machine: Intel MKL 11.1, Intel MPI 4.1.2, Intel® Fortran Compiler 14.0, Intel® C++ Compiler 14.0.

Note that the Intel MKL libraries are installed in the directory /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib, which contains three subdirectories:
$ ls /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib
ia32 intel64 mic


Step 1. Prepare ScaLAPACK example code
Here is a C example, pblas3_d_example.c, taken from a previous Intel MKL version. It calls PDGEMM (PBLAS) and PDLANGE (ScaLAPACK). Copy it to your working directory:
 $ mkdir ~/scalapack_mic
 $ cp pblas3_d_example.c ~/scalapack_mic/.

Note that Intel MKL also provides code examples in its installation directory, so you can use one of those instead.

Step 2. Compile the example code
Unlike other Intel MKL routines, the Intel MKL ScaLAPACK library supports two programming models on Intel Xeon Phi coprocessors:

  • Native model: the application runs solely on the coprocessors. It can be launched from the host or from the coprocessor.
  • Symmetric model: the application runs on both the host and the coprocessors simultaneously. Each coprocessor works as a regular node of a heterogeneous cluster consisting of the host and the coprocessors. This model is sometimes called the hybrid model.

Building steps for the native model:
Set up the environment for the compiler and for the Intel MPI Library for Intel Xeon Phi coprocessors:
 $ source /opt/intel/composer_xe_2013_sp1/bin/ intel64
 $ source /opt/intel/impi/4.1.2/bin64/

Build the application for the coprocessor:
$ mpiicc -mmic pblas3_d_example.c -o pblas3.mic -I/opt/intel/composer_xe_2013_sp1.1.106/mkl/include /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/mic/libmkl_scalapack_lp64.a -Wl,--start-group /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/mic/libmkl_intel_lp64.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/mic/libmkl_intel_thread.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/mic/libmkl_core.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/mic/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -openmp

The pblas3.mic executable will be produced.
For native mode, skip to Step 3.
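
The build commands in this step differ only in the -mmic flag and in the library subdirectory (mic versus intel64). As a rough sketch, the link line can be factored into a small shell function. The function name mkl_scalapack_link_line and the variable MKL_DIR are illustrative assumptions, not part of Intel MKL, and the script only prints the compile command instead of running the compiler.

```shell
#!/bin/sh
# Hypothetical helper (not part of Intel MKL): print the static
# MKL ScaLAPACK compile/link options for one target.
# $1 = MKL install root, $2 = library subdirectory (mic or intel64).
mkl_scalapack_link_line() {
    libdir="$1/lib/$2"
    # printf repeats the '%s ' format for every argument.
    printf '%s ' \
        "-I$1/include" \
        "$libdir/libmkl_scalapack_lp64.a" \
        "-Wl,--start-group" \
        "$libdir/libmkl_intel_lp64.a" \
        "$libdir/libmkl_intel_thread.a" \
        "$libdir/libmkl_core.a" \
        "$libdir/libmkl_blacs_intelmpi_lp64.a" \
        "-Wl,--end-group" \
        "-lpthread -lm -openmp"
    printf '\n'
}

# MKL_DIR is an illustrative variable holding the install path used in this article.
MKL_DIR=/opt/intel/composer_xe_2013_sp1.1.106/mkl

# Print (do not run) the coprocessor build command.
echo "mpiicc -mmic pblas3_d_example.c -o pblas3.mic $(mkl_scalapack_link_line "$MKL_DIR" mic)"
```

Passing intel64 instead of mic as the second argument yields the corresponding options for the host build.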

For the symmetric model, also build the application for the host:
$mpiicc pblas3_d_example.c -o -I/opt/intel/composer_xe_2013_sp1.1.106/mkl/include /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a /opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -openmp

The executable will be produced.

Step 3. Run the application
Before running the application, set up the run-time environment on the coprocessors as described in [3]:
 scp /opt/intel/impi/* mic0:/bin/
 scp /opt/intel/impi/* mic0:/lib64/
 scp /opt/intel/composer_xe_2013_sp1.1.106/compiler/lib/mic/ mic0:/lib64/

Native mode:
Upload the application pblas3.mic to the /tmp directory on the coprocessors using the scp command.
 $ scp pblas3.mic mic0:/tmp/.
 $ ssh mic0
 $ cd /tmp
 $ mpirun -n 4 ./pblas3.mic
The command spawns 4 MPI processes on a coprocessor.
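
As noted in Step 2, a native application can also be launched from the host instead of logging in to the coprocessor. The following is a minimal sketch of that variant. The helper name native_launch_cmd is hypothetical, and the script assumes that I_MPI_MIC=enable must be set on the host, as in the symmetric model. It only prints the mpirun command rather than executing it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an mpirun command that starts
# a coprocessor binary from the host.
# $1 = number of MPI processes, $2 = coprocessor name, $3 = binary path.
native_launch_cmd() {
    echo "mpirun -n $1 -host $2 $3"
}

# Assumption: coprocessor support in Intel MPI is enabled on the host,
# as in the symmetric model.
export I_MPI_MIC=enable

# Same process count and binary location as the commands above.
native_launch_cmd 4 mic0 /tmp/pblas3.mic
```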

Symmetric model:
Upload the application pblas3.mic to the /tmp directory on the coprocessors.
Enable the MPI communication between host and coprocessors and use the appropriate script to set up the runtime environment:
 $ source /opt/intel/composer_xe_2013_sp1/bin/ intel64
 $ source /opt/intel/impi/4.1.2/bin64/

 $export I_MPI_MIC=enable
 $mpirun -n 2 -host localhost ./ : -n 2 -host mic0 /tmp/pblas3.mic
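The command spawns 2 MPI processes on the host machine and 2 MPI processes on a coprocessor.

To print Intel MPI debug information and select the TCP fabric explicitly, pass the corresponding -genv options: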
 $mpirun -n 2 -genv I_MPI_DEBUG 2 -genv I_MPI_FABRICS tcp -host localhost ./ : -n 2 -genv I_MPI_DEBUG 2 -host mic0 /tmp/pblas3.mic
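
The symmetric launch uses one mpirun command with two argument sets separated by a colon: one for the host and one for the coprocessor. As a sketch, the command can be assembled from the process counts and binary paths. The helper name symmetric_launch_cmd is hypothetical, and ./pblas3.host is only a placeholder for the host executable, whose name is not given in this article. The script prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a symmetric-model mpirun command.
# $1 = processes on the host, $2 = host binary,
# $3 = processes on the coprocessor, $4 = coprocessor name,
# $5 = coprocessor binary.
symmetric_launch_cmd() {
    echo "mpirun -n $1 -host localhost $2 : -n $3 -host $4 $5"
}

# ./pblas3.host is a placeholder name for the host build (an assumption).
symmetric_launch_cmd 2 ./pblas3.host 2 mic0 /tmp/pblas3.mic
```

Changing the two process counts shifts the share of work between the host and the coprocessor.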


This article shows how to compile and run a simple Intel MKL ScaLAPACK application on an Intel Xeon Phi coprocessor. Besides the native model, Intel MKL ScaLAPACK can benefit from another usage model offered by the Intel MPI Library: the symmetric (or hybrid) model, which treats each Intel Xeon Phi coprocessor as a regular node in a heterogeneous cluster consisting of Intel Xeon processors and Intel Xeon Phi coprocessors. The symmetric model allows developers to run ScaLAPACK on both the host and the coprocessors. For more information on Intel MKL ScaLAPACK, see the Intel MKL User Guide and the Intel MPI Library User Guide.

[1] Online articles: Using Intel MKL on Intel Xeon Phi Coprocessors

[2] Intel MKL User Guide

[3] Intel MPI Library reference and online doc.

Downloads are available under the Intel Sample Source Code License Agreement.