Wrong Answer from cblas_dgemm and dgemv when matrix not square

Wrong Answer from cblas_dgemm and dgemv when matrix not square

bcarlson@dynamic-concepts.com's picture

Hello, I'm new to using the MKL, and I'm getting incorrect answers when my "a" matrix isn't square. I'm running the MKL under Fedora Core 5 with the latest kernel update.

I'm using the following matrices and vectors.

a = [ 1 2 ] b = [ 11 12 13 ]
[ 3 4 ] [ 14 15 16 ]
[ 5 6 ]

d = [ 2 ]
[ 4 ]

for a*b using
cblas_dgemm(order, transa, transb, m, n, k, alpha, (double *) &a[0][0], lda, (double *) &b[0][0], ldb, beta, (double *) &c[0][0], ldc)

with

m = 3 ;
n = 3 ;
k = 2 ;
alpha = 1.0 ;
lda = 3 ;
ldb = 2 ;
ldc = 3 ;
beta = 1.0 ;

CBLAS_ORDER order = CblasRowMajor ;

CBLAS_TRANSPOSE transa = CblasNoTrans ;

CBLAS_TRANSPOSE transb = CblasNoTrans ;


Results in a Segmentation Fault.

When I do the matrix-vector multiplication a*d
with the following parameters

m = 3 ;
n = 2 ;
alpha = 1.0 ;
lda = 3 ;
beta = 1.0 ;
incx = incy = 1.0 ;

i get..

e = [ 10.000000 ] when it should be e = [ 10.0 ]
[ 28.000000 ] [ 22.0 ]
[ 132.000000 ] [ 34. 0 ]

I've attached my makefile and code (matmult.CPP) below.

Thank you for any help you can provide.

Brian

----- MAKEFILE --------------

OBJECTS = matmult.CPP

MKL_DIR = /opt/intel/mkl/8.1.1/lib/32/

HEADERS =

matmult: $(OBJECTS)
g++ $(OBJECTS) -L$(MKL_DIR) -lguide -lmkl -lmkl_lapack -lmkl_ia32 -o matmult -lm

matmult.o: $(HEADERS) matmult.CPP
g++ -c matmult.CPP

----- CODE (matmult.CPP) ------------

#include
#include
#include
#include "/opt/intel/mkl/8.1.1/include/mkl.h"

#define A_ROWS 3
#define A_COLS 2
#define B_ROWS 2
#define B_COLS 3
#define C_ROWS 3
#define C_COLS 3
#define D_ROWS 2
#define E_ROWS 3


int main ()
{

int i,j,counter ;
int m, n, k ;

int a_rows = A_ROWS ;
int a_cols = A_COLS ;
int b_rows = B_ROWS ;
int b_cols = B_COLS ;
int c_rows = C_ROWS ;
int c_cols = C_COLS ;
int d_rows = D_ROWS ;
int e_rows = E_ROWS ;


double alpha, beta ;
int lda, ldb, ldc ;

double a[A_ROWS][A_COLS], b[B_ROWS][B_COLS], c[A_ROWS][B_COLS] ;
double d[A_COLS], e[A_ROWS] ;


CBLAS_ORDER order = CblasRowMajor ;
// CBLAS_ORDER order = CblasColMajor ;
CBLAS_T
RANSPOSE transa = CblasNoTrans ;
CBLAS_TRANSPOSE transb = CblasNoTrans ;


for ( i = 0 ; i < a_rows ; i++ )
{
for ( j = 0 ; j < a_cols ; j++ )
{
counter++ ;
a[i][j] = counter ;
}
}

for ( i = 0 ; i < b_rows ; i++ )
{
for ( j = 0 ; j < b_cols ; j++ )
{
counter++ ;
b[i][j] = counter ;
}
}

for ( i = 0 ; i < c_rows ; i++ )
for ( j = 0 ; j < c_cols ; j++ )
c[i][j] = 0.0 ;

counter = 0 ;
for ( i = 0 ; i < d_rows ; i++ )
{
counter++ ;
counter++ ;
d[i] = counter ;
}

for ( i = 0 ; i < e_rows ; i++ )
e[i] = 0.0 ;

printf("
a =
") ;
for ( i = 0 ; i < a_rows ; i++ )
{
for ( j = 0 ; j < a_cols ; j++ )
{
printf(" %lf ",a[i][j]) ;
}
printf("
") ;
}

printf("
b =
") ;
for ( i = 0 ; i < b_rows ; i++ )
{
for ( j = 0 ; j < b_cols ; j++ )
{
printf(" %lf ",b[i][j]) ;
}
printf("
") ;
}

printf("
d =
") ;
for ( i = 0 ; i < a_cols ; i++ )
{
printf(" %lf
",d[i]) ;
}
printf("
") ;

m = a_rows ;
n = b_cols ;
k = a_cols ;
alpha = 1.0 ;
lda = m ;
ldb = k ;
ldc = m ;
beta = 1.0 ;

// CALLING matrix-matrix multiplication function (cblass_dgemm)
cblas_dgemm(order, transa, transa, m, n, k, alpha, (double *) &a[0][0], lda, (double *) &b[0][0], ldb, beta,
(double *) &c[0][0], ldc) ;

m = a_rows ;
n = a_cols ;
alpha = 1.0 ;
lda = m ;
beta = 1.0 ;

// CALLING matrix-vector multiplication function (cblass_dgemv)
cblas_dgemv(order, transa, m, n, alpha, (double *) &a[0][0], lda, &d[0], 1, beta, &e[0], 1) ;

printf("
c =
") ;
for ( i = 0 ; i < a_rows ; i++ )
{
for ( j = 0 ; j < b_cols ; j++ )
{
printf(" %lf ",c[i][j]) ;
}
printf("
") ;
}

printf("
e =
") ;
for ( i = 0 ; i < a_rows ; i++ )
{
printf(" %lf
",e[i]) ;
}
printf("
") ;


exit(0) ;
}

2 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.
bcarlson@dynamic-concepts.com's picture

To the greatest extent that I can do I think I've proved (to myself atleast) that the A matrix, being loaded into the cblas_dgemm and cblass_dgemv routines must be square in order for the result to be correct?


Can anyone verify this for me?


Seems like something that belongs in the documentation.

Login to leave a comment.