I'm writing a program that calls geqp3 frequently. Because the MKL subroutine is geqp3(A, jpvt), I need an array A (and jpvt) for every call.
The array size may be slightly different in each call. I know that it works if, on every call, I allocate the memory for A and jpvt, call geqp3, and then deallocate A and jpvt. However, this is slow. Especially when I use OpenMP to parallelize the program, CPU usage is only 30% when the number of threads is set to 32 (my server has 16 cores and 32 threads).
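To make the pattern concrete, here is a simplified C sketch of what each call does at the moment. It is not my real code: `factor_stub` is a hypothetical placeholder for geqp3 so that the sketch compiles without MKL, and `run_calls` and the matrix sizes are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for geqp3 so the sketch runs without MKL:
   it only fills the buffers and marks every column as free to pivot. */
static void factor_stub(double *a, int *jpvt, int m, int n)
{
    for (int j = 0; j < n; ++j) {
        jpvt[j] = 0;
        for (int i = 0; i < m; ++i)
            a[(size_t)j * m + i] = (i == j) ? 1.0 : 0.0;
    }
}

/* Runs `ncalls` factorizations whose sizes vary slightly from call to call.
   Every iteration allocates and frees its own A and jpvt.
   Returns the number of calls that completed. */
int run_calls(int ncalls)
{
    assert(ncalls >= 0);
    int done = 0;

    #pragma omp parallel for reduction(+:done)
    for (int k = 0; k < ncalls; ++k) {
        int m = 100 + k % 5;   /* made-up sizes, slightly different each call */
        int n = 100 + k % 3;

        double *a = malloc((size_t)m * n * sizeof *a);
        int *jpvt = malloc((size_t)n * sizeof *jpvt);
        if (a != NULL && jpvt != NULL) {
            factor_stub(a, jpvt, m, n);   /* the real code calls geqp3 here */
            done += 1;
        }
        free(a);      /* free(NULL) is a no-op, so this is safe on failure */
        free(jpvt);
    }
    return done;
}
```

Calling `run_calls(1000)` goes through the allocate, factorize, deallocate cycle 1000 times; this per-call allocation inside the parallel loop is the part I suspect is slow.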
Is there any simple tip to solve this problem? Thank you very much!