The description of the mkl_?omatadd function in the manual is a bit confusing:
- The parameter m is described as "The number of matrix rows". Rows of which matrix (A, B, or C)? And what if I want to transpose A or B?
- The same question applies to parameter n.
- Why is ldc documented as an output parameter?
I want to perform an operation of the form C = alpha*A + beta*B^t.
Thanks in advance for your help,
I am trying to compile FEniCS on OS X using the Intel compilers.
The newest addition to my saga is that CMake creates test binaries when testing MKL that cannot be killed, which leads to a slow corruption of the system (fonts go crazy, etc.). The error is reproducible on a MacBook Pro Retina 2011 and a Mac Pro (cylinder, 2013).
The "zombie" is created when Teuchos is testing:
Performing Test CXX_COMPLEX_BLAS_WORKS
CMake is invoked via hashdist with the following parameters:
We have been using the following code, compiled with Intel Fortran on Linux clusters and linked with MKL 10.2.2.025, for a number of years. We are trying to update to the Intel 15 compilers and MKL 11.2 update 2. The symmetric solve produces wrong answers with the new compiler and libraries; the code has not changed. When we include the symmetric terms and solve it as a symmetric_structure matrix, we get the right answers.
Is anyone else having problems with symmetric solves with the current compiler and library?
Load the data into indx, vall, and y_rt ...
I have "old" codes that call the Fortran numerical_libraries routines GQRUL and DGQRUL to compute a Gauss-Legendre quadrature rule for numerical integration. I used to be able to simply put the line "USE numerical_libraries" in the main routine and was subsequently able to call the GQRUL and DGQRUL functions.
I have been trying to optimize matrix multiplication on NUMA systems but so far without much luck.
I have played around with the dgemm routine and first touch.
A snippet of my code looks like this:
Is the Parallel Direct Sparse Solver for Clusters supported on Windows OS?
I'm developing an application that needs to compute various eigenvalue decompositions. Is it possible to call zfeast_heev from multiple threads in parallel? Of course, each thread has its own memory. I could not find this information in the documentation. Currently I'm using zhpevd, which works fine when called from multiple threads; zfeast_heev, however, does not.
Looking forward to your answers.