I try to use MKL to generate lots of random data every time on Xeon phi, but the performance is very bad comparing the performance on Xeon CPU.(E5620) .
The attachment is the original code, and the compile option for Xeon Phi is -O3 -mkl -mmic. and it takes about 115 seconds, however when I run it on Xeon CPU,it only takes 3.5 seconds. I do not know why the difference is so much. Is the way in which I use the Xeon Phi wrong or the real performance on Xeon Phi is bad?