I am experiencing a strange performance problem with MKL 10.0.2 under Visual Studio 2005/2008 Express Edition.
I am using cblas_sgemm/cblas_dgemm to do a matrix multiplication as follows:
Matrix A (m x n), where m is around 50000 and n is around 50.
Matrix B (m x n).
Matrix C (n x n).
I need to compute C = transpose(A) * B,
so for single precision I wrote
cblas_sgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);
and the same in double precision:
cblas_dgemm(CblasColMajor, CblasTrans, CblasNoTrans, n, n, m, 1.0f, A, m, B, m, 0.0f, C, n);
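To be explicit about what these calls are expected to compute, here is a naive reference version of the single-precision case. It is only a sketch for checking results, not a replacement for MKL: ref_atb_float is a made-up helper name, and it assumes column-major storage with leading dimensions lda = ldb = m and ldc = n, as in the calls above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not an MKL routine).
 * Computes C = transpose(A) * B for column-major matrices:
 *   A is m x n with leading dimension lda,
 *   B is m x n with leading dimension ldb,
 *   C is n x n with leading dimension ldc. */
void ref_atb_float(int m, int n, const float *A, int lda,
                   const float *B, int ldb, float *C, int ldc)
{
    assert(m > 0 && n > 0 && lda >= m && ldb >= m && ldc >= n);
    for (int j = 0; j < n; ++j) {          /* column of C (and of B) */
        for (int i = 0; i < n; ++i) {      /* row of C (column of A) */
            float sum = 0.0f;
            for (int k = 0; k < m; ++k) {
                /* Column-major: element (k, i) lives at index i * ld + k. */
                sum += A[(size_t)i * lda + k] * B[(size_t)j * ldb + k];
            }
            C[(size_t)j * ldc + i] = sum;
        }
    }
}
```

Calling ref_atb_float(m, n, A, m, B, m, C, n) should give the same C as the cblas_sgemm call, up to floating-point rounding.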
Both calls produce the correct output. However, when I use sgemm with A, B, and C as float*, it runs about 5 times slower than dgemm with A, B, and C as double*.
Could anybody help me figure out why? Thank you very much!
Why is cblas_sgemm 5 times slower than cblas_dgemm?