Intel® Math Kernel Library

mkl_scscmm performance problem


I am building the attached program on OLCF's Rhea supercomputer with the Intel compiler, icc (ICC) 14.0.4 20140805. Armadillo has a naive implementation of cscmm. I run the program with MKL_NUM_THREADS=1 on Rhea to multiply a sparse matrix of size 83328x124992 by a dense matrix of size 124992x50. The following is the output.

 ./a.out 83328 124992 50 0.00001

The output of the test code

Troubles with undefined _MKLMPI_Get_wrappers

Hello all,


Recently, I successfully compiled and linked the MKL Pardiso solver in a Fortran 2008 code by including mkl_pardiso.f90 in the code and using the following compiler and linker flags:


-m64 -O2 -I$(MKLROOT)/include -static-intel -L$(MKLROOT)/lib -mkl -qopenmp -qopenmp-link static


MKLROOT is set to /opt/intel//compilers_and_libraries_2016.1.111/mac/mkl


Now I have tried to compile with mkl_cluster_sparse_solver.f90 instead, using the same makefile settings, and ran into the following compilation error:


Better way to compute phi0 + sigma*vector?


I want to compute the quantity prob = phi0 + sigma*atilde, where phi0 and sigma are scalars and atilde is a vector of size 1 x ind (in the code below, sigma is passed as sqrt(sigma2)). I have computed it like this:

for(i=0;i<ind;i++){ones[i] = 1.0;}

 cblas_dcopy(ind, ones, 1, B, 1);
 cblas_dscal(ind, phi0, B, 1);
 cblas_dcopy(ind, atilde, 1, Bcan, 1);
 cblas_dscal(ind, sqrt(sigma2), Bcan, 1);
 vdAdd(ind, B, Bcan, prob);

I would like to ask if there is a better way to do it.

Thank you very much.
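
One possible simplification, given as a sketch rather than a definitive answer: the expression is an element-wise scale and shift, so a single pass over atilde is enough and the temporary arrays ones, B and Bcan are not needed. The helper name affine_shift below is hypothetical and is not an MKL or CBLAS routine. With CBLAS, a similar result could be reached by filling prob with phi0 and then calling cblas_daxpy(ind, sigma, atilde, 1, prob, 1), assuming prob and atilde do not overlap.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MKL or CBLAS):
 * computes prob[i] = phi0 + sigma * atilde[i] for i = 0 .. n-1
 * in one pass, without any temporary vectors. */
static void affine_shift(size_t n, double phi0, double sigma,
                         const double *atilde, double *prob)
{
    for (size_t i = 0; i < n; ++i) {
        prob[i] = phi0 + sigma * atilde[i];
    }
}
```

For the code in the question, the call would be affine_shift(ind, phi0, sqrt(sigma2), atilde, prob), with math.h included for sqrt.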

Questions about the mkl_<>csrmm function

Hi everyone,

I used the mkl_<>csrmm function for deep learning, but I ran into a really strange problem. One parameter of mkl_<>csrmm is called pntrb (the row pointer in compressed sparse row format), and its definition is:


INTEGER. Array of length m.

For one-based indexing this array contains row indices, such that pntrb(I) - pntrb(1) + 1 is the first index of row I in the arrays val and indx.

For zero-based indexing this array contains row indices, such that pntrb(I) - pntrb(0) is the first index of row I in the arrays val and indx.
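
To make the definition above concrete, here is a small sketch of how pntrb relates to the usual 3-array CSR row pointer. It assumes zero-based indexing and a row pointer array row_ptr of length m + 1. The companion array pntre (the row end pointers) is also taken by mkl_<>csrmm but is not quoted above, and the helper names csr3_to_csr4, csr_row_start and csr_row_nnz are hypothetical, used only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: split a 3-array CSR row pointer (length m + 1)
 * into the pntrb / pntre pair (each of length m).
 * Row i occupies positions pntrb[i] .. pntre[i] - 1 of val and indx. */
static void csr3_to_csr4(size_t m, const int *row_ptr,
                         int *pntrb, int *pntre)
{
    for (size_t i = 0; i < m; ++i) {
        pntrb[i] = row_ptr[i];
        pntre[i] = row_ptr[i + 1];
    }
}

/* Zero-based case of the definition: pntrb[i] - pntrb[0] is the
 * position in val and indx of the first stored entry of row i. */
static int csr_row_start(const int *pntrb, size_t i)
{
    return pntrb[i] - pntrb[0];
}

/* Number of stored entries in row i. */
static int csr_row_nnz(const int *pntrb, const int *pntre, size_t i)
{
    return pntre[i] - pntrb[i];
}
```

For a matrix with three rows holding 2, 0 and 3 stored entries, row_ptr is {0, 2, 2, 5}, which gives pntrb = {0, 2, 2} and pntre = {2, 2, 5}. An empty row shows up as pntrb[i] equal to pntre[i].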

Efficient storage of 3D field data


I'm working on a numerical code that solves a discretised equation on a three-dimensional grid. I have multiple fields (around 80) that I need to store on this grid and that are needed to compute my results. I want to perform my computations, which consist of rather simple operations (such as products and dot products), to set up a sparse matrix and solve it using MKL. My questions are:

1) What is the most efficient way to store the data: a 1D array or a multidimensional array? At the moment I'm using a 1D array and accessing it in my innermost loop using
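
For illustration only, and not the indexing used in the post (which is cut off above), one common way to address a 3D grid stored in a 1D array is sketched below. It assumes x is the fastest-varying index and that each field occupies its own contiguous block of nx*ny*nz values. The names idx3 and field_idx are hypothetical.

```c
#include <stddef.h>

/* Hypothetical flat index for grid point (i, j, k) on an
 * nx x ny x nz grid, with i (x) varying fastest in memory. */
static size_t idx3(size_t i, size_t j, size_t k, size_t nx, size_t ny)
{
    return i + nx * (j + ny * k);
}

/* Hypothetical index for field f at (i, j, k) when all fields are
 * stored back to back, one contiguous nx*ny*nz block per field. */
static size_t field_idx(size_t f, size_t i, size_t j, size_t k,
                        size_t nx, size_t ny, size_t nz)
{
    return f * nx * ny * nz + idx3(i, j, k, nx, ny);
}
```

With this layout, running i in the innermost loop touches consecutive memory locations, which is generally what makes a flat array comparable to a multidimensional one in speed.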
