Matrix Q of QR decomposition

Hello,
I need to obtain the matrix Q of a QR decomposition, so I've been
using *geqrf followed by *orgqr, and it works well. The
problem comes when I use these functions with threaded MKL: I get a
good speedup with *geqrf, but no speedup with *orgqr. I've checked the user's
manual, and it seems that *orgqr is not a threaded function.

I have also tested 'dormqr()' with the matrix C set to the identity and,
as expected, I get the same results as with 'dorgqr()'. According to
the manual, 'dormqr()' is a threaded function, so with threaded MKL I get
matrix Q faster than with 'dorgqr()'. To my surprise, though, when I run
both functions with sequential MKL, 'dormqr()' is nearly 3 times faster than
'dorgqr()'. How is this possible if both functions do the same thing and
'dormqr()' additionally performs a matrix-matrix multiplication?
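
The two routes compared above can be illustrated with a small, self-contained sketch. This is not MKL or LAPACK code: it is a plain unblocked Householder QR in Python, and every function name in it (householder_vectors, form_q_explicit, apply_q) is made up for the illustration. It only shows that forming Q directly (the job of 'dorgqr()') and applying Q to an identity matrix (what 'dormqr()' does when C = I) give the same matrix; it says nothing about how either routine is actually implemented or tuned.

```python
import math

def identity(m):
    return [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

def apply_reflector(v, c, row0, col0):
    """In place: c <- (I - 2 v v^T) c, touching only rows >= row0 and cols >= col0.
    The caller must know that v is zero above row0 and that the skipped
    columns are orthogonal to v."""
    m, ncols = len(c), len(c[0])
    for j in range(col0, ncols):
        s = sum(v[i] * c[i][j] for i in range(row0, m))
        for i in range(row0, m):
            c[i][j] -= 2.0 * s * v[i]

def householder_vectors(a):
    """Reduce a (m x n, m >= n) to upper-triangular form.
    Returns (vs, r), where each unit vector v in vs defines H = I - 2 v v^T
    and Q = H_0 H_1 ... H_{n-1}."""
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    vs = []
    for k in range(n):
        v = [0.0] * m
        for i in range(k, m):
            v[i] = r[i][k]
        alpha = math.sqrt(sum(t * t for t in v))
        if alpha > 0.0:
            v[k] += math.copysign(alpha, v[k])
            nv = math.sqrt(sum(t * t for t in v))
            v = [t / nv for t in v]
        apply_reflector(v, r, k, k)
        vs.append(v)
    return vs, r

def form_q_explicit(vs, m):
    """Route 1 (in the spirit of dorgqr): build Q from the reflectors,
    exploiting the identity structure so step k only touches cols >= k."""
    q = identity(m)
    for k in reversed(range(len(vs))):
        apply_reflector(vs[k], q, k, k)
    return q

def apply_q(vs, c):
    """Route 2 (in the spirit of dormqr): compute Q * c for a general c,
    so every step touches all columns of c."""
    out = [row[:] for row in c]
    for k in reversed(range(len(vs))):
        apply_reflector(vs[k], out, k, 0)
    return out

# Small made-up test matrix (4 x 3), purely for illustration.
a = [[2.0, -1.0, 0.0],
     [1.0,  3.0, 1.0],
     [0.0,  1.0, 4.0],
     [1.0,  0.0, 2.0]]

vs, r = householder_vectors(a)
q_explicit = form_q_explicit(vs, len(a))
q_applied = apply_q(vs, identity(len(a)))

# Largest entrywise difference between the two versions of Q.
max_diff = max(abs(x - y)
               for row1, row2 in zip(q_explicit, q_applied)
               for x, y in zip(row1, row2))
print(max_diff)
```

In this sketch the second route touches every column at every step, so it does at least as much arithmetic as the first. That is why the sequential timing described above is surprising.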

thanks!! :)
Jorge

Hi Jorge,
1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will do that in one of the next updates.
2. Regarding dormqr: what is the size of the tasks?
--Gennady

I use dorgqr and dormqr with the same dimensions (2250x2249) for matrix Q. Is that important for their performance?
Thanks Gennady

For dormqr, these sizes are large enough to see the performance benefits of threading.

I know, but my question is why dormqr is faster (nearly 3 times) than dorgqr when I'm running both with sequential MKL.
