Using Intel MKL BLAS and LAPACK with PETSc

This document describes how to build the Portable Extensible Toolkit for Scientific Computation (PETSc) with Intel® Math Kernel Library (Intel® MKL) BLAS and LAPACK.

It also describes how to enable Intel® MKL sparse linear algebra operations, including Sparse BLAS, and how to use Intel® MKL PARDISO and Cluster PARDISO as direct solvers in PETSc applications.

PETSc is a set of libraries that provides functions for building high-performance, large-scale applications. The PETSc library includes routines for vector manipulation, sparse matrix computations, distributed arrays, linear and non-linear solvers, and extensible PDE solvers.

This application note focuses on building PETSc for Intel® Architecture Processors including IA-32 and Intel® 64 architecture, running Linux*.

Version Information
This document applies to Intel® MKL 2018 Update 2 for Linux* and PETSc 3.8.4.

Step 1 – Download

PETSc releases are available for download from the PETSc web site.
To get Intel® MKL go to /en-us/intel-mkl/.

Step 2 – Configuration

  • Use one of the following commands to obtain the PETSc files. Cloning the repository creates a new folder petsc; extracting the release tarball creates a folder petsc-3.8.4:
    $ git clone -b maint petsc
    $ tar -xvzf petsc-3.8.4.tar.gz
  • Change to the PETSc folder:

$ cd petsc-3.8.4

  • Set the PETSC_DIR environment variable to point to the location of the PETSc installation folder. For bash shell use:

$ export PETSC_DIR=$PWD

Step 3 – Build PETSc

PETSc includes a set of Python configuration files that support various compilers, MPI implementations, and math libraries. The examples below show options for configuring PETSc to link to Intel MKL BLAS and LAPACK functions. Developers need to ensure that the other options are configured appropriately for their system. See the PETSc installation documentation for details.

Intel provides BLAS and LAPACK through the Intel® MKL library. It usually works with GNU or Intel compilers on Linux and with Microsoft or Intel compilers on Windows. Pass its location to PETSc configure with, for example: --with-blaslapack-dir=/opt/intel/mkl

If the above option does not work, determine the correct library list for your compilers using the Intel MKL Link Line Advisor and pass it with the configure option --with-blaslapack-lib.
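As an illustration of the --with-blaslapack-lib route, the sketch below assembles one possible library list and prints the resulting configure command for review. The library list is an assumption: it is a common Link Line Advisor result for Intel 64 with the LP64 interface, sequential (non-threaded) MKL, and dynamic linking. Your compiler, threading layer, and MKL version may call for a different list, so treat the advisor's output as authoritative.

```shell
#!/bin/sh
# Assumption: MKL is installed under /opt/intel/mkl unless MKLROOT is already set.
MKLROOT="${MKLROOT:-/opt/intel/mkl}"

# Assumption: Intel 64, LP64 interface, sequential MKL, dynamic linking.
# Replace this with the list the Link Line Advisor gives for your setup.
MKL_LIBS="-L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl"

# Print the configure command instead of running it, so it can be checked first.
printf './configure --with-blaslapack-lib="%s"\n' "${MKL_LIBS}"
```

Once the printed command looks right, run it from the PETSc folder together with whatever compiler and MPI options your system needs.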

  • Invoke the configuration script with the following options to build PETSc with Intel MKL (installed to the default location /opt/intel/mkl).
  • For Intel processors with Intel 64 use the following option:

$ ./config/

  • For Intel 32-bit processors use the following options:

$ ./config/

P.S. If you get an error message about a missing link against some other MKL library when the Python configuration script runs, try specifying the BLAS library instead of the directory: --with-blas-lapack-lib=\"$MKLROOT/lib/<intel64|ia32>/\"

  • Use the make file to build PETSc:

    $ make all

Step 4 – Run PETSc

Run the PETSc tests to verify the build worked correctly:

$ make test
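The tests confirm that PETSc works, but they do not by themselves show which BLAS and LAPACK library was linked. One way to check, sketched below, is to look for MKL among the dynamic dependencies of the PETSc library. The helper name uses_mkl and the library path are assumptions: the path supposes a shared-library build at $PETSC_DIR/$PETSC_ARCH/lib/libpetsc.so, and the check reports nothing if MKL was linked statically.

```shell
#!/bin/sh
# uses_mkl FILE: succeed if the dynamic dependencies of FILE mention an MKL library.
# (Hypothetical helper name, introduced for this sketch only.)
uses_mkl() {
    ldd "$1" 2>/dev/null | grep -qi 'libmkl'
}

# Assumption: shared-library build; adjust the path for your install layout.
PETSC_LIB_FILE="${PETSC_DIR:-.}/${PETSC_ARCH:-}/lib/libpetsc.so"

if uses_mkl "${PETSC_LIB_FILE}"; then
    echo "MKL found in the dependencies of ${PETSC_LIB_FILE}"
else
    echo "no MKL dependency detected for ${PETSC_LIB_FILE}"
fi
```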

Enabling Intel® MKL Sparse Linear operation in PETSc applications

PETSc users can now also benefit from Intel® MKL sparse linear operations inside their applications. The development version of PETSc now supports an analogue of the AIJ matrix format that calls Intel® MKL kernels for matrix-vector multiplication. With this update, PETSc users can easily switch to Intel® MKL for sparse linear algebra operations and gain a performance benefit in most PETSc solvers.

The following Intel® MKL functionality is currently supported in PETSc:

  • Intel® MKL BLAS/LAPACK as basic linear algebra operations
  • Intel® MKL PARDISO and Cluster PARDISO as direct solvers
  • Intel® MKL Sparse BLAS IE for AIJ matrix operations

In progress are the following extensions:

  • Support for BAIJ and SBAIJ formats
  • Support for the MatMatMul() operation via Intel® MKL Sparse BLAS IE calls
  • Support for the triple product operation via Intel® MKL Sparse BLAS IE calls

For any specific requests, please contact the Intel Online Service Center or the forum.

How to use PETSc development copy with enabled Intel® MKL Sparse BLAS IE:

  • Download the latest version of PETSc. Download instructions can be found on the PETSc website.
  • Configure PETSc with MKL by adding --with-blas-lapack-dir=/path/to/mkl to the configuration line.

Example: ./configure --with-blas-lapack-dir=/path/to/mkl --with-mpi-dir=/usr/local/mpich

More information on PETSc installation can be found on the PETSc website.

  • When running a PETSc application, pass “-mat_type aijmkl” to the executable, or set the matrix type with a MatSetType(A,MATAIJMKL) call in the source code. For more information, see the PETSc examples.
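The run-time route in the last step can be sketched as follows. The executable name ./my_petsc_app, the process count, and the use of mpiexec as the launcher are placeholders, not part of the original instructions. Note that -mat_type only takes effect for matrices on which the program calls MatSetFromOptions().

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own executable and process count.
APP="${APP:-./my_petsc_app}"
NPROCS="${NPROCS:-4}"

# -mat_type aijmkl selects the MKL-backed AIJ matrix type at run time.
# Assumption: mpiexec is the launcher provided by your MPI installation.
RUN_CMD="mpiexec -n ${NPROCS} ${APP} -mat_type aijmkl"

# Print the command for review instead of executing it.
echo "${RUN_CMD}"
```

Running the same application once with and once without the option is a simple way to compare the two matrix types on your own problem.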

Appendix A – System Configuration

The PETSc build and testing were completed on a system with an Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz running Ubuntu* 16.04.4 LTS.




I have a C program that uses PETSc, and now I need to use some functions from LAPACKE and ScaLAPACK. I configured PETSc as described above, but when I try to include "mkl.h" or "mkl_scalapack.h" in my program, I get:

error: mkl.h: No such file or directory

warning: implicit declaration of function 'LAPACKE_dgeqp3'

warning: implicit declaration of function 'pdgeqpf'

Can you please tell me how to compile the PETSc program with MKL so that I can use functions from LAPACKE and ScaLAPACK?

My makefile is as below:



include ${PETSC_DIR}/conf/variables
include ${PETSC_DIR}/conf/rules
include ${PETSC_DIR}/conf/test

mpicgnew: mpicgnew.o    chkopts
    -${CLINKER} -o mpicg mpicg.o ${PETSC_LIB}




I built PETSc with Intel MKL in both Linux and Windows-7 (with Intel compilers). For some reason the Windows PETSc lib is much slower than the Linux lib in solving linear equation systems (the Linux and Win-7 workstations have about the same speed based on other tests). The following are the Intel MKL that I used:

Linux: /apps/compilers/intel_2011/composerxe-2011.4.191/mkl/lib/intel64

Win-7: C:\Program Files (x86)\Intel\Compiler\11.1\051\mkl\em64t

Is the Linux MKL library newer or better optimized than the Windows version? Can I get a version for Windows similar to the one on Linux?



The second one is the newer one. It is 10.2 Update 2. More MKL versions are available for download.



I am trying to build PETSc-3.4.2 with Intel MKL BLAS/LAPACK in Windows-7, and noticed that there are two versions of Intel MKL libraries in my Win-7 workstation:

  1. C:\Program Files\Intel\MKL\\em64t\lib
  2. C:\Program Files (x86)\Intel\Compiler\11.1\051\mkl\em64t\lib

I would like to use the most optimized (fastest) one. Which one should I use?

Many thanks,










When I followed the article, I got this error:

ld: cannot find -lPEPCF90
But PEPCF90 belongs to IVF 7.0 and earlier.


Install complete. It is useable with PETSC_DIR=/home/chengwl/opt/petsc-3.1-p8 [and no more PETSC_ARCH].
Now to check if the libraries are working do (in current directory):
make PETSC_DIR=/home/chengwl/opt/petsc-3.1-p8 test
[chengwl@hksbs-s13 petsc-3.1-p8]$ make PETSC_DIR=/home/chengwl/opt/petsc-3.1-p8 test
Running test examples to verify correct installation
C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process
--------------Error detected during compile or link!-----------------------
ifort -c -g -I/home/chengwl/opt/petsc-3.1-p8/include -I/home/chengwl/opt/petsc-3.1-p8/include -I/home/chengwl/opt/petsc-3.1-p8/include/mpiuni -I/home/chengwl/opt/petsc-3.1-p8/include -I/home/chengwl/opt/petsc-3.1-p8/include/mpiuni -o ex5f.o ex5f.F
ifort -g -o ex5f ex5f.o -Wl,-rpath,/home/chengwl/opt/petsc-3.1-p8/lib -L/home/chengwl/opt/petsc-3.1-p8/lib -lpetsc -lX11 -Wl,-rpath,/opt/intel/composerxe-2011.4.191/mkl/lib/intel64 -L/opt/intel/composerxe-2011.4.191/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lPEPCF90 -ldl -L/opt/intel/Compiler/11.1/064/lib/intel64 -L/opt/intel/composerxe-2011.4.191/compiler/lib/intel64 -L/opt/intel/composerxe-2011.4.191/mkl/lib/intel64 -L/opt/intel/cce/10.1.018/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -limf -lsvml -lipgo -lirc -lgcc_s -lirc_s -lifport -lifcore -lm -lpthread -lm -ldl -limf -lsvml -lipgo -lirc -lgcc_s -lirc_s -ldl
ld: cannot find -lPEPCF90
make[3]: [ex5f] Error 1 (ignored)
/bin/rm -f ex5f.o
Completed test examples


The behavior you've described, listed below, is suspicious and will be treated as a performance defect. Please submit your test case(s) to either Premier Support or the Intel MKL User Forum so we can reproduce and investigate the problem:
- not seeing performance benefits from PETSC with MKL on large applications
- not seeing performance benefits when using MKL versus FORTRAN BLAS compiled with Intel Fortran
- PETSC 2.3.3-p15 appears to be faster without MKL on the large-scale code

--Amanda S.
Intel Corp.

[Second submit is just to correct e-mail address; sorry for mistyped e-mail address in first submission.]

PETSC 2.3.3-p15 and PETSC 3.0.0-p6 both work fine with MKL. And the above instructions work.
However, the instructions refer to an old version of PETSC. Generically, those instructions can be easily adapted.

What is troubling, however, is that in the experiments that I have done, I have seen counter-intuitive performance behaviors. I have not seen performance benefits from PETSC on large applications
a) when moving to newer versions, b) when using MKL versus FORTRAN BLAS just compiled with Intel Fortran Compiler.

PETSC 2.3.3-p15 appears to be faster without MKL on the large-scale code I've been experimenting with than with it. PETSC 3.0.0 without MKL appears to be much slower than PETSC 2.3.3 without MKL, and PETSC 3.0.0 with MKL is a bit faster than PETSC 3.0.0 without MKL...

One thought is that for small problem sizes MKL is less efficient than Fortran BLAS, but we're still trying to figure out how performance can be lower when MKL works very well when benchmarked.
That is, FORTRAN BLAS has higher small-problem performance!?!
