Intel® Math Kernel Library

BLACS broadcast 64-bit integer


I'm trying to broadcast a 64-bit integer with the BLACS routines IGEBS2D() and IGEBR2D(). Environment: CentOS 6.5 Linux, ifort, composer_xe_2013_sp1.2.144, intel64, Intel MPI, ilp64 libs. Despite declaring all integers as integer*8, compiling with -i8, and linking exclusively with ilp64 libs, only 32 bits of the 64-bit integer seem to be broadcast. My compile line is:

  mpiifort -i8 -o demo1 demo1.f -warn all -L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lmkl_blacs_intelmpi_ilp64 -lpthread -lm
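One way to confirm the symptom described above is to compare the received value against the low 32 bits of the value that was sent. The following is a minimal sketch in C, not part of the original post; the function names are hypothetical, and it assumes the truncated value arrives either zero-extended (for example, a zeroed 64-bit receive buffer on a little-endian machine with only 4 bytes written) or sign-extended (the value passed through a 32-bit signed integer).

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero-extended low 32 bits of v (hypothetical helper). */
int64_t low32_zero_extended(int64_t v)
{
    return (int64_t)((uint64_t)v & 0xFFFFFFFFu);
}

/* Sign-extended low 32 bits of v (hypothetical helper). */
int64_t low32_sign_extended(int64_t v)
{
    int64_t lo = low32_zero_extended(v);
    /* If bit 31 is set, subtract 2^32 to reproduce a 32-bit signed value. */
    return (lo & 0x80000000LL) ? lo - 0x100000000LL : lo;
}

/* True when `received` differs from `sent` but matches its low 32 bits. */
bool looks_like_32bit_truncation(int64_t sent, int64_t received)
{
    if (received == sent) {
        return false; /* value arrived intact */
    }
    return received == low32_zero_extended(sent) ||
           received == low32_sign_extended(sent);
}
```

Broadcasting a test value above 2^32, such as 2^32 + 5, makes the check unambiguous: a receiver that reports 5 has only seen the low 32 bits.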

Sample program:

Should I place a barrier before calling pdpotri()?

I am using pdpotrf() to perform the Cholesky decomposition. Then I want to call pdpotri() to invert the matrix. The function is called from every process, just after pdpotrf(). Should I put a barrier there, so that I am sure all the processes are done with the Cholesky factorization before moving on to the inversion, or is it not needed?

Calling LAPACK/MKL from parallel OpenMP region

Dear All,

I often call some BLAS and LAPACK (MKL) routines from my Fortran programs. Typically, I try to place these calls outside of any parallel OpenMP regions while then making use of the parallelism inside the routines.

In a new part of the code, however, I need to call DGESV from an "omp parallel" region (see dummy code below). The code below will crash because all threads call GESV individually. Putting the GESV call into an "omp single" section works, but limits performance, as it runs in one thread only.

Computation of the Schur complement in MKL

Hello everyone,

I recently came across a project in which I have to compute the Schur complement of a complex symmetric matrix. I know that starting from Intel® MKL 11.2 update 1, MKL supports the computation of Schur complements. However, I have two questions that don't seem to have been answered anywhere. Note: I am writing C code.

a) Does this functionality support complex arithmetic? I know that the PARDISO implementation in MKL supports complex arithmetic, but does this also apply to the Schur complement computation?
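For reference, the quantity being asked about can be written out as follows. This is the standard definition of the Schur complement for a 2x2 block partition, not text from the original post; the block names $A_{11}, A_{12}, A_{21}, A_{22}$ are notation introduced here, and $A_{11}$ is assumed to be invertible.

```latex
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},
\qquad
S = A_{22} - A_{21} A_{11}^{-1} A_{12}.

% Complex symmetric case (A = A^{T}, not Hermitian): A_{21} = A_{12}^{T}, so
S = A_{22} - A_{12}^{T} A_{11}^{-1} A_{12},
\qquad
S^{T} = S.
```

In the complex symmetric case the transpose is a plain transpose with no conjugation, so the Schur complement is itself complex symmetric.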

Distributed Cholesky

I am doing a distributed Cholesky factorization. You can find the code I am using (almost the same) here. I am gathering the submatrices (after every node has executed Cholesky) into the master node exactly as shown in this example. I am wondering whether I can do better by choosing better parameters, especially the block size, to get faster execution.

Possible dgetrf IPIV issue

Hello, I am attempting to use dgetrf to get an LU factorization of a square matrix as part of a large mex program. When I check the output of dgetrf, I find that IPIV contains both a 0 and a number equal to the size of the matrix. I checked the documentation, and it says the zero should not be there.
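The range rule referred to above can be checked mechanically. The sketch below is not from the original post; the function name is hypothetical, and it assumes the pivot array is stored as C `int` (an LP64 build). LAPACK documents IPIV as 1-based, with row i interchanged with row IPIV(i), so each entry should satisfy i <= IPIV(i) <= M. A value equal to the matrix size is therefore legal, while 0 is not. If the integer width of the array differs from what the linked library expects, entries can appear out of range when read back, which may be worth ruling out in a mex build.

```c
/* Returns the 0-based position of the first pivot entry that violates
   i <= ipiv[i-1] <= m (1-based i, for i = 1..min(m, n)), or -1 if every
   entry is in range. Hypothetical helper; assumes 32-bit int pivots. */
int first_invalid_pivot(const int *ipiv, int m, int n)
{
    int k = (m < n) ? m : n;      /* dgetrf fills min(m, n) pivot entries */
    for (int i = 0; i < k; ++i) {
        int p = ipiv[i];
        /* Partial pivoting picks a row at or below the current one. */
        if (p < i + 1 || p > m) {
            return i;
        }
    }
    return -1;
}
```

For a 3x3 matrix, the pivot array {3, 3, 3} passes this check, while {0, 3, 3} is flagged at position 0.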

I have been able to reproduce this error in a smaller test case:

The C script (test_case.c)

Question: cycle count of 65536 MKL FFT DftiComputeForward(C++)

My code is as follows:


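#include "mkl_dfti.h"

// Forward single-precision complex 1D FFT of length M, out of place.
// InputData and OutputData each hold M interleaved (re, im) float pairs.
MKL_LONG fft_mkl(int M, float *InputData, float *OutputData)
{
    MKL_LONG status;
    DFTI_DESCRIPTOR_HANDLE my_desc1_handle;  // handle type, not DFTI_DESCRIPTOR

    // The length is passed through a variadic argument, so cast it to MKL_LONG.
    status = DftiCreateDescriptor(&my_desc1_handle, DFTI_SINGLE, DFTI_COMPLEX, 1, (MKL_LONG)M);
    if (status != DFTI_NO_ERROR) return status;
    DftiSetValue(my_desc1_handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
    DftiCommitDescriptor(my_desc1_handle);
    status = DftiComputeForward(my_desc1_handle, InputData, OutputData);
    DftiFreeDescriptor(&my_desc1_handle);
    return status;  // status of the transform itself
}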



float *test = new float [65536*2];
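When interpreting a measured cycle count for a transform of this size, a nominal operation count can serve as a reference point. The sketch below is not from the original post. It uses the conventional estimate of 5 N log2(N) floating-point operations for a complex FFT of power-of-two length N, a figure commonly used to normalize FFT benchmarks. It is not the actual number of instructions MKL executes, and the function name is hypothetical.

```c
/* Nominal operation count 5 * n * log2(n) for a complex FFT of
   power-of-two length n. Returns 0 if n is zero or not a power of two.
   Hypothetical helper for normalizing measured timings. */
unsigned long long fft_nominal_flops(unsigned long long n)
{
    if (n == 0 || (n & (n - 1)) != 0) {
        return 0;                 /* estimate only defined here for 2^k */
    }
    unsigned long long log2n = 0;
    for (unsigned long long t = n; t > 1; t >>= 1) {
        ++log2n;                  /* count halvings to get log2(n) */
    }
    return 5ULL * n * log2n;
}
```

For N = 65536, log2(N) is 16, so the nominal count is 5 * 65536 * 16 = 5,242,880 operations. Dividing a measured cycle count by this number gives a rough cycles-per-operation figure for comparison across runs.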
