Non-square Matrix Transpose

Hi guys,

Are there any highly optimized MKL routines, or maybe performance primitives, that can transpose a rectangular matrix without scaling?

I've been using mkl_omatcopy, but it seems to perform worse than a naive baseline implementation, and I suspect this is due to the additional scaling it performs. I've attached a plot comparing a naive baseline implementation against omatcopy and imatcopy. I know the latter runs very poorly on non-square matrices.
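For reference, here is a minimal sketch of what a scaling-free out-of-place baseline might look like. It is not the poster's actual code: the function names, the row-major layout, and the tile size of 32 are all assumptions made for illustration, and the blocked variant is just one common way to make the naive loop more cache-friendly.

```c
#include <stddef.h>

/* Assumed layout: A is rows x cols, B is cols x rows, both row-major.
   No scaling factor is applied, unlike the alpha argument of omatcopy. */
void transpose_naive(size_t rows, size_t cols, const double *a, double *b)
{
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            b[j * rows + i] = a[i * cols + j];
}

/* Hypothetical tile size; the best value depends on the cache hierarchy. */
#define TILE 32

/* Same result as transpose_naive, but walks the matrix tile by tile so
   that reads and writes stay within a small working set. */
void transpose_blocked(size_t rows, size_t cols, const double *a, double *b)
{
    for (size_t ii = 0; ii < rows; ii += TILE) {
        size_t imax = (ii + TILE < rows) ? ii + TILE : rows;
        for (size_t jj = 0; jj < cols; jj += TILE) {
            size_t jmax = (jj + TILE < cols) ? jj + TILE : cols;
            for (size_t i = ii; i < imax; ++i)
                for (size_t j = jj; j < jmax; ++j)
                    b[j * rows + i] = a[i * cols + j];
        }
    }
}
```

Timing both variants against the MKL call on the same matrix sizes would show whether the gap really comes from the scaling or from the memory access pattern.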

vtune 2013 and very long run hotspot analysis

A customer using vtune 2013 tried this command line for a 40+ hour run with 16 MPI ranks. I know, it's asking a lot. The customer reported:

"We've been trying to use it for a problem that runs for 40+ hours.

The elapsed time reported in vtune is far less than it should be, something like 1 hour in this case. Also, the finer-grain times reported when browsing the functions appear to be far smaller than expected.

mkl_ddiamv multi-thread support

Does mkl_ddiamv support multi-threaded calculation?

In my project's properties I set "Use Intel MKL" to "Parallel". Most of my program's run time is spent in matrix-vector multiplication, which is performed by mkl_ddiamv. In Task Manager I always see that ONLY ONE thread is busy in my program's process. I use Intel C++ Compiler 14.0. I have tried many different array sizes for the multiplication, and it makes no difference. I have tried, without success, to find any information on why the calculation runs on a single thread.
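For comparison purposes, a hand-written diagonal-format matrix-vector product can be sketched as below. This is not MKL's implementation and says nothing about whether mkl_ddiamv itself is threaded. The function name dia_matvec, the zero-based indexing, and the storage layout (val[d * n + i] holds the entry in row i of diagonal d) are assumptions for illustration. The OpenMP pragma only takes effect when the compiler's OpenMP option is enabled; otherwise the loop runs serially.

```c
#include <stddef.h>

/* Computes y = A * x for a square n x n matrix stored by diagonals.
   Assumed layout: val[d * n + i] holds A[i][i + dist[d]], where dist[d]
   is the offset of diagonal d from the main diagonal. Entries that
   would fall outside the matrix are skipped. */
void dia_matvec(size_t n, size_t ndiag, const double *val,
                const long *dist, const double *x, double *y)
{
    /* Each row is independent, so rows can be split across threads. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        double sum = 0.0;
        for (size_t d = 0; d < ndiag; ++d) {
            long j = i + dist[d];
            if (j >= 0 && j < (long)n)
                sum += val[d * n + (size_t)i] * x[j];
        }
        y[i] = sum;
    }
}
```

Running something like this next to the MKL call while watching thread activity would at least show whether the machine and build settings allow more than one busy thread for this kind of kernel.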

Intel Compiler doesn't generate program database for release mode


Once again I have a case where compiling my huge program with MSVC works just fine, but with ICC -O2 it ends in almost never-ending compilation and linker problems, and with -O1 it seems to generate incorrect code (it crashes immediately). So I want to inspect the generated code, but although I specify that I want a PDB so that MSVC can understand the code, ICC doesn't report anything and simply doesn't create the PDB... really frustrating... This is the command line:
