Waht is wrong with cblas_dgemv

Waht is wrong with cblas_dgemv

Hello,

   I am interested in the following operation:

Y = A^T * X

using the cblas_dgemv or cblas_dgemm routine in MKL. In the code I use small letters: y, a, and x.

I have a test example as follows:

#include <cstdio>
#include <cstdlib>
#include "mkl.h"

void main()
{

 double a[6] = {2., 3., 1., 1., 0., 2. }; // 3 x 2 matrix
  double x[3] = {1., 1., 1.};
  double y[2] = {0., 0.};
  int m = 3;
  int n = 2;
  double alpha = 1.0;
  double beta = 1.0;
  int incx = 1;
  int incy = 1;
  int i,j;

  for (i=0; i<m; i++){
    for (j=0; j<n; j++)
      printf("%g ", a[i*n + j]);
    printf("\n");
  }

  cblas_dgemv(CblasRowMajor, CblasTrans, m, n, alpha, a, m, x, incx, beta, y, incy );

  for (i=0; i<n; i++)
    printf("%g\n", y[i]);

}

After running the code, I get wrong results. Can someone point out the mistake?

Regards,

Pawan

11 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.
Best Reply

The seventh argument to cblas_dgemv should be n. The convention is that a matrix is passed by specifying four attributes:

  •  the address of the base, i.e., the address of the (1,1) element in Fortran or the [0][0] element in C/C++, 
  • the number of rows, m
  • the number of columns, n
  • the stride from one row to the next, if using row-major order as in C/C++, or the stride from one column to the next, if using column-major order as in Fortran.

You are passing a 3 X 2 matrix using row-major order; therefore, the stride from each row to the next is 2.

There is some avoidable confusion in the MKL documentation; this is caused by using Fortran-centric terms such as "leading dimension", which a C/C++ programmer may take to be something different.

Thanks. It works now.

This was tricky! I will apply this trick also to the dgemm routine.

regards,
Pawan

For dgemm, it was confusing as well. I put an working

example below for someone in need.

Goal: C := alpha A^T * B + beta * C, with alpha = 1; beta = 0;

Use the following:

 cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, A.ncols(), B.ncols(), A.nrows(),

                       alpha, A.vec, A.ncols(), B.vec, B.ncols(), beta, C.vec, B.ncols())

  Here I take:

     LDA: stride for A = A.ncols();  // it would have been A.nrows(), if transA = 'N'

NOTE: Notice that this is a bit confusing when compared to dgemv, because for dgemm, the number of rows and number of columns of A^T are input. By this logic, one may take LDA to be A.nrows() above, because the rows of A^T have strides of A.nrows(). In other words, although for dgemm, we are required to put number of rows and number of columns of A^T, for LDA we must consider strides for A not for A^T.

     LDB: stride for B = B.ncols();

     LDC: stride for C = B.ncols();

Regards,

Pawan

 

 

Best Reply

Hi Pawan,

Thank you a lot for your comments about dgemv and dgemm.

Just add some

based on the documentation of dgemm: Cmxn=op(A)mxk*op(B)kxn

m INTEGER. Specifies the number of rows of the matrix op(A) and of the matrix C. The value of m must be at least zero

k, INTEGER. Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B).

Where the documetnation of dgemv, the m. n is from A and LDA is  A.

m INTEGER. Specifies the number of rows of the matrix A. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A. The value of n must be at least zero.

So you are exctaly right, that for dgemm, we are required to put number of rows and number of colums of A^T when transA=T.

But LDA is  the stride of A both the dgemv and dgemm, whatever transA= N or T.

I guess, you may have tried

1. cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, 2, 1, 3,  1.0, a, 2, x, 1, 0, y, 1);   get y=(6, 3)

2. cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 1, 3,  1.0, a, 3, x, 1, 0, y, 1); get y=(3,6)

thus happen to get the conclustion of  LDA=A.nrows when trans='N'.  Actually, it is not correct because the A are different under the two cases.  for example, if you change the original A =  {2., 3., 5., 1., 0., 2. }, you will see the difference more obviously.

2 3 )T        1      m=3, n=2                                     2   3    5               1        m=2, n=3

5 1     *     1   =   7                                               1   0     2   *          1   =     10

(0 2    *      1        6                                                                          1            3

Hope it can lighten some confustion.

Best Regards,

Ying

 

 

 

Thanks Ying for a correction. I quote your correction:

"But LDA is the stride of A both the dgemv and dgemm, whatever transA= N or T. "

Regards,
Pawan

Hello!

I have a problem with the function dgemv so I thought to post it here. If I should open a new topic please let me know to fix it.

 

Let A1, A2, A3 be 4x4 matrices.

A1 = 5  6  7  8     , A2 = 21 22 23 24  , A3 = 37 38 39 40

       9 10 11 12              25 26 27 28            41 42 43 44

      13 14 15 16             29 30 31 32            45 46 47 48

      17 18 19 20             33 34 35 36             49 50 51 52

I defined in this way in my code

double A[48];

double y[4];

for( int i=0; i<48; i++) {A[i] = i+5;}.

 

I want to multiply the matrix A1 with the vector

 

double pi0 = {0.1, 0.2, 0.3, 0.4};

so I did this

     

		 cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, &A[0:15], 4, pi0, 1, 0.0, y, 1);

The result I got is {21, 25, 29,31} , although the true result is {7, 11 , 15 ,19};

One solution that worked was to copy the first 15 values of the matrix A to another matrix B, then when I did
 

int B[16];

        for(int i=0; i<16; i++)
        {B[i] =A[i];
        }
cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, B, 4, pi0, 1, 0.0, y, 1);

it worked.

 

The problem is that I don't understand what went wrong in the first way. Also if I first copy the matrix I think that this will require more time. Please, help me.

Thank you.

Hi

Replace &A[0:15] by A.

 Does it work now?

  cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, A, 4, pi0, 1, 0.0, y, 1);

Regards,

Pawan

Hi! Yes, now it works. Could I ask you something? How do we find the lda? I have understood that if we have a matrix mxn and we save it row-by-row then the lda is equal to the number of columns. But, how do we decide in a situation like here? I mean when we have 3 matrices which we have defined firstly the first matrix row-by-row, then the second matrix row-by-row and etc. 

If  we want to compute the product of the matrix A2 (or A3) with the vector pi0 how do we select the lda?

In general how do we decide the lda  in we have P matrices (not 3). (I have a problem like this but with P matrices, where P =0,1,2,3,...)

If you know a book, notes , anything that could help me with that please tell me.

 

Thanks for helping me!

Hi Anas,

Right, lda is generally equal to the number of columns when A is mxn.

About the complex situation like A[48], it is a 1-D array in C. and

A1={A[0],  A[15]};

A2={A[16],  A[31]};

A3={A[32], .. A[47];

As you see, the A1, A2, A3, can be accessed by regular patten as a 1-D C-array.  The adress of A2[mxn] is &A[16]. and it's mxn elements are continous elements in A[16] .. A[31], so lda is still n. you can just move the first pointer to access A2, A3. for example,   

if  A2 * pio,  call cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, &A[16], 4, pi0, 1, 0.0, y, 1);  or A+16

if A3* pio,   call cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, A+32 , pi0, 1, 0.0, y, 1);   or &A[32]

The key point is how you access the array in C array. if only the array can be accessed by some regular pattern as a C array, then you can decide the input parameter of A.

Best Regards,

Ying

P.S. Please ignore the syntax I mentioned last time about Array notation like A[0:15]. It is special syntax only for Array notation, can't be taken as a C pointer to pass to MKL funtion.

I had serious problems with my pc and I just saw the answer. Ying, you are great! I have no words to express my appreciation. Thank you.

Leave a Comment

Please sign in to add a comment. Not a member? Join today