Fortran intrinsic functions vs. MKL functions or subroutines

Hi all,

I am wondering which I should use in my code. For example, for a matrix multiplication A(100,100)*B(100,100), should I use matmul(A,B) or gemm()?

The same uncertainty applies to other functions, e.g. dot_product, and to the VML functions, e.g. exp(A) vs. vsexp().

Let's ignore parallelization, because I mostly perform these operations within each OpenMP thread.

Thank you.

Benqiang


Hi Benqiang,

MKL DGEMM is well optimized for large problem sizes. For a matrix size of (100,100), dgemm is expected to have better performance. There is a post discussing this here: http://software.intel.com/en-us/forums/topic/269726
matmul may be faster in very small cases, but for larger problem sizes MKL is well optimized and performs better.
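To make the comparison concrete, here is a minimal sketch contrasting the intrinsic with the BLAS call for the 100x100 case mentioned above. It assumes you link against MKL (or any BLAS library providing dgemm); the argument list shown is the standard BLAS dgemm interface, C = alpha*A*B + beta*C.

```fortran
program matmul_vs_dgemm
  implicit none
  integer, parameter :: n = 100
  double precision :: a(n,n), b(n,n), c_intrinsic(n,n), c_blas(n,n)

  call random_number(a)
  call random_number(b)

  ! Fortran intrinsic: simple, no external dependencies
  c_intrinsic = matmul(a, b)

  ! BLAS/MKL: C = alpha * op(A) * op(B) + beta * C
  ! 'N','N' = no transpose; leading dimensions all n for contiguous arrays
  c_blas = 0.0d0
  call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c_blas, n)

  ! The two results should agree to rounding error
  print *, 'max abs difference:', maxval(abs(c_intrinsic - c_blas))
end program matmul_vs_dgemm
```

Wrapping each call in a timing loop (e.g. with system_clock) is the usual way to see the crossover point between the two on a given machine.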

For the VML functions, both MKL and the compiler provide vectorized implementations with good performance. MKL additionally offers precision control (by setting VML_HA/VML_LA/VML_EP), so it gives you more options to balance precision against performance.
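A short sketch of the precision-control mechanism mentioned above, using the double-precision exponential (vdexp; vsexp is the single-precision variant). It assumes MKL's VML Fortran interface is available via the mkl_vml module shipped with MKL; the array length here is an arbitrary example.

```fortran
program vml_exp_demo
  use mkl_vml          ! MKL's VML Fortran interface module (from mkl_vml.f90)
  implicit none
  integer, parameter :: n = 1000
  double precision :: x(n), y(n)
  integer :: oldmode

  call random_number(x)

  ! Select lower-accuracy / higher-speed mode; VML_HA (default high
  ! accuracy) and VML_EP (enhanced performance) are the alternatives
  oldmode = vmlsetmode(VML_LA)

  ! Vectorized exp over the whole array in one call
  call vdexp(n, x, y)

  ! Compare one element against the compiler's scalar intrinsic
  print *, y(1), exp(x(1))
end program vml_exp_demo
```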

For a function like dot_product, the code is very simple and the compiler can optimize it well, so both the compiler intrinsic and MKL can deliver good performance there.
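For completeness, a sketch of the two dot-product routes being compared: the dot_product intrinsic and the standard BLAS ddot routine (the double-precision dot product, available through MKL). The vector length is an arbitrary example.

```fortran
program dot_demo
  implicit none
  integer, parameter :: n = 100000
  double precision :: x(n), y(n), r_intrinsic, r_blas
  double precision, external :: ddot   ! BLAS level-1 dot product

  call random_number(x)
  call random_number(y)

  r_intrinsic = dot_product(x, y)      ! compiler-optimized intrinsic
  r_blas = ddot(n, x, 1, y, 1)         ! BLAS/MKL equivalent, stride 1

  print *, r_intrinsic, r_blas         ! should agree to rounding error
end program dot_demo
```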

Thanks,
Chao 

Hi everybody,

>>...
>>A triple do-loop takes 13.089 seconds.
>>A matmul(a,b) function takes 33.056 seconds.
>>A DGEMM subroutine takes 1.840 seconds
>>...

I'd like to add that the test results at the end of the thread Chao mentioned are questionable (very outdated!), and a classic triple do-loop cannot outperform Fortran's MATMUL function.

We recently tested several matrix multiplication functions; please take a look at this thread:

Forum Topic: Haswell GFLOPS
Web-link: http://software.intel.com/en-us/forums/topic/394248

Note: Page 2 of the thread has the most interesting information, with test results for matrix sizes of 4Kx4K, 8Kx8K, and 16Kx16K.
