I have an issue when using Intel MKL to provide BLAS + LAPACK + ScaLAPACK. Specifically, I use it with OpenMPI + GCC and link against libmkl_scalapack_lp64, libmkl_blacs_openmpi_lp64, libmkl_intel_lp64, libmkl_core, and libmkl_sequential. The issue is with calling pdpotri after a Cholesky factorization:
pdpotri_(&uplo, &n_columns, A_loc, &submatrix_row, &submatrix_column, descriptor, &info);
void pdpotri_(const char *UPLO, const int *N, double *A, const int *IA, const int *JA, const int *DESCA, int *INFO);
I don't have any issues with my code on either Ubuntu or macOS when using Netlib ScaLAPACK 2.0.2 + OpenBLAS 0.2.20. In those cases, Cholesky factorization followed by pdpotri produces correct results that agree with serial LAPACK. Also note that my other small test programs (e.g. computing the L1 norm or doing a Cholesky factorization) run fine with Intel MKL. This suggests that something is wrong in the Intel MKL implementation of pdpotri.
A bit more on the issue itself: it happens when running a small test program with 4 MPI processes (2x2 grid); the program runs fine for a 64x64 matrix with 32 blocks but fails for a 120x120 matrix with 32 blocks. Specifically, I see a floating point exception from the process with rank 2. I'm not sure I can debug it further on my side.
P.S. This is with Intel MKL 2017.3.196.