MKL Linear Least Squares Issues

I posted a question about a week ago about the performance of the <?>gels routines versus a manual implementation. I mentioned that the MKL functions were significantly faster than the manual version for matrices with fewer than 50000 elements, but that the speed dropped drastically as soon as I approached this mark.

I am trying to use cgels, cgelss, cgelsy, and cgelsd to compute the LLS solution approaching realtime. The matrices being used are A(100000,8) and b(100000,3).

Also, when using these routines I am having trouble locating the solution matrix in the 'b' variable. I have been using something like the following pseudocode, but I get varying answers in odd formats. I was expecting the first 8 rows and 3 columns to hold the solution vectors, but the answers are a bit off in comparison to the manual method.

  • complex<float> matrix A(8,100000); complex<float> matrix b(3,100000); complex<float> matrix X(8,3);
  • A = transpose(A); b = transpose(b);
  • m = 100000; n = 8; nrhs = 3;
  • lda = 100000; ldb = 100000;
  • matrix_order = ROW_MAJOR;
  • <?>gels(matrix_order, m, n, nrhs, A, lda, b, ldb, ...)

Any input is appreciated. Thank you again for your time.


There are examples of calling the LAPACKE routines LAPACKE_cgels(), etc., in the mkl/examples/lapacke/source directory. The solution is returned in the first n rows of the matrix b.
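
One thing worth checking in the pseudocode from the question: with row-major storage the leading dimension is the length of one stored row, so the interface expects lda >= n and ldb >= nrhs (8 and 3 here), whereas lda = ldb = 100000 describes a much wider array than the one actually allocated. The sketch below does not call MKL. It uses NumPy's `np.linalg.lstsq` as a stand-in, and the helper names `gels_like_row_major` and `read_solution` are made up for illustration. It only shows where the n-by-nrhs solution would sit in a flat row-major b buffer.

```python
import numpy as np

def gels_like_row_major(a_flat, b_flat, m, n, nrhs, lda, ldb):
    """Hypothetical stand-in for a row-major <?>gels call (this is not MKL).

    Overwrites the leading n-by-nrhs block of b with the least-squares
    solution. Rows n..m-1 of b are left untouched here, which differs
    from what the real routine stores there.
    """
    if lda < n or ldb < nrhs:
        raise ValueError("row-major storage needs lda >= n and ldb >= nrhs")
    A = a_flat.reshape(m, lda)[:, :n]      # view: row i starts at i * lda
    B = b_flat.reshape(m, ldb)[:, :nrhs]   # view: row i starts at i * ldb
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    B[:n, :] = X                           # writes through into b_flat
    return b_flat

def read_solution(b_flat, n, nrhs, ldb):
    """Collect X[i, j] = b_flat[i * ldb + j] for i < n, j < nrhs."""
    return np.array([[b_flat[i * ldb + j] for j in range(nrhs)]
                     for i in range(n)])

# Small consistent system (m = 5, n = 2, nrhs = 2) with a known solution.
m, n, nrhs = 5, 2, 2
lda, ldb = n, nrhs                         # tight row-major strides
A = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [2, 1]], dtype=np.complex64)
X_true = np.array([[1 + 1j, 0.5], [2 - 1j, -1j]], dtype=np.complex64)
a_flat = A.ravel().copy()
b_flat = (A @ X_true).ravel().copy()

gels_like_row_major(a_flat, b_flat, m, n, nrhs, lda, ldb)
X = read_solution(b_flat, n, nrhs, ldb)
```

For the sizes in the question this would mean reading X[i, j] from b[i * 3 + j] for i < 8 and j < 3, assuming ldb = 3.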

There is not much that one can say about your "manual implementation", etc., since we have no inkling as to what it might entail. I suggest that you first reach a stage where you can obtain correct results with MKL and with your other methods, and establish that the two (or more) results agree. Only after that is done would it make sense to compare running times, because the execution time of a program is of no significance if that program does not run to completion and produce correct results.
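
As a rough sketch of what such an agreement check could look like, the NumPy code below compares a library least-squares solution with one obtained from the normal equations, and also tests the optimality condition A^H (A X - B) = 0 to within a tolerance. The normal-equations solver is only an assumed stand-in for the "manual" method, which was never described, and the function names and tolerances are illustrative choices.

```python
import numpy as np

def normal_equations_reference(A, B):
    """Solve (A^H A) X = A^H B; an assumed stand-in for a manual method."""
    AH = A.conj().T
    return np.linalg.solve(AH @ A, AH @ B)

def satisfies_normal_equations(A, B, X, rtol=1e-3):
    """True if A^H (A X - B) is negligible relative to the problem scale."""
    grad = A.conj().T @ (A @ X - B)
    norm_A = np.linalg.norm(A)
    scale = norm_A * (norm_A * np.linalg.norm(X) + np.linalg.norm(B))
    return bool(np.linalg.norm(grad) <= rtol * scale)

# Random single-precision complex test problem (sizes chosen for speed).
rng = np.random.default_rng(0)
m, n, nrhs = 200, 8, 3
A = (rng.standard_normal((m, n))
     + 1j * rng.standard_normal((m, n))).astype(np.complex64)
B = (rng.standard_normal((m, nrhs))
     + 1j * rng.standard_normal((m, nrhs))).astype(np.complex64)

X_lib, *_ = np.linalg.lstsq(A, B, rcond=None)   # library solution
X_ref = normal_equations_reference(A, B)        # "manual" stand-in
agree = np.allclose(X_lib, X_ref, atol=1e-3)
```

Once both solutions pass a check of this kind on the real data, timing the two approaches against each other becomes meaningful.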
