I need to obtain the matrix Q of a QR decomposition, so I've been calling *geqrf followed by *orgqr, and this works well. The problem arises when I use these functions with threaded MKL: I get a good speedup with *geqrf but no speedup with *orgqr. I've checked the user's manual, and it seems that *orgqr is not a threaded function. Is there any other way to obtain the matrix Q with a threaded function?
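For reference, the two-step workflow described above looks roughly like the sketch below. It is only an illustration: it goes through SciPy's low-level LAPACK wrappers (`dgeqrf`, `dorgqr`) rather than calling MKL directly, the helper name `explicit_q` is made up for this example, and whether either call runs threaded depends on the LAPACK library that SciPy happens to be linked against.

```python
import numpy as np
from scipy.linalg import lapack


def explicit_q(a):
    """Return (Q, R) for a real m x n matrix with m >= n.

    Hypothetical helper: step 1 is dgeqrf, step 2 is dorgqr,
    mirroring the *geqrf + *orgqr sequence from the question.
    """
    a = np.asarray(a, dtype=np.float64)
    m, n = a.shape
    if m < n:
        raise ValueError("this sketch assumes m >= n")

    # Step 1: factorize. The result holds R in its upper triangle and the
    # Householder vectors below the diagonal; tau holds their scalar factors.
    qr, tau, _work, info = lapack.dgeqrf(a)
    if info != 0:
        raise RuntimeError(f"dgeqrf failed with info={info}")
    r = np.triu(qr[:n, :])

    # Step 2: build the explicit m x n matrix Q from the stored reflectors.
    q, _work, info = lapack.dorgqr(qr, tau)
    if info != 0:
        raise RuntimeError(f"dorgqr failed with info={info}")
    return q, r


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((6, 4))
    q, r = explicit_q(a)
    # Q should have orthonormal columns and Q @ R should reproduce A.
    print(np.allclose(q.T @ q, np.eye(4)), np.allclose(q @ r, a))
```

The second step is the one in question here: it is the call that turns the compact reflector storage into an explicit Q, so it is the part that would need a threaded replacement.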
Matrix Q of QR decomposition