Using Cluster MKL PBLAS/ScaLAPACK FORTRAN routine in your C program

Introduction

Intel MKL PBLAS and ScaLAPACK routines are written in FORTRAN 77  (with exception of a few utility routines written
in C to exploit the IEEE arithmetic). It is difficult for the programmer who used to C language convention. This article describles a sample to call the Intel® Math Kernel Library (Intel® MKL) PBLAS routines from a C program to show the difference between C calling converstion and Fortran calling conversion . 
Please see the detail discussion on MKL forum.

1. Problem
    The sample is using PDGEMV(), which computes a distributed matrix-vector product y =alphaAx+beta*y.
    A= [1  4  7  10  13 ; 3   6   9   12  15 ;  5  8   11  14  17 ; 7  10  13  16  19;  9  12  15  18  21]
    and x=[1 ; 1; 0; 0; 1]T, x is a column vector, Call PDGEMV() routine to compute y=Ax. 
    The right result is y=[18; 24; 30; 36; 42]T 

2. Code sample in C: pdgemv.c
    The function call in C is like,

    double alpha = 1.0; double beta = 0.0;
    pdgemv_("N",&M,&M,&alpha,A,&ONE,&ONE,descA,x,&ONE,&ONE,descx,&ONE,&beta,y,&ONE,& amp;ONE,descy,&ONE);
 
3. Build and Run it

    Build it under linux environment, assume intel 64bit application. 
    ! set environment and build pdgemv.c 
    $source /opt/intel/mkl/10.x.x.0xx/tools/environment/mklvarsem64t.sh
    $source /opt/intel/mpi/3.x.x/bin64/mpivars.sh
    $source /opt/intel/Compiler/11.x/0xx/bin/iccvars.sh intel64
    $mpiicc -w -o pdgemv pdgemv.c -I/opt/intel/mkl/10.x.x.0xx/include -L/opt/intel/mkl/10.x.x.0xx/lib/em64t
     -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

    ! run the application 
    $ mpirun -n 4 ./pdgemv

4. Notes

To call Fortran function in C, you need to take care of two things:

1. Pass variables by address, not by value.   for example,

double alpha = 1.0;
 pdgemv_("N",&M,&M,&alpha

2. Store your data in Fortran style, that is, column-major rather than row-major order

Intel MKL PBLAS/ScaLAPACK routines are written in FORTRAN interface, so Column-major are used and row-major is not acceptable.  If you gives the matrix to these routines in row-major - even your code is completely right from C (Row-major) point of view,  the result will be almost always wrong.

For more complete information about compiler optimizations, see our Optimization Notice.