VTune shows that about 30% of the time is spent in the function mkl_blas_p4_dinner_general_large. It's an internal function of MKL, and I couldn't find any information about it. Could anyone tell me something about it? And what could I do to optimize?
Need help on mkl_blas_p4_dinner_general_large

