On my page http://www.thesa-store.com/products/ (not currently available; see below), in item 2.2 (for a P4 processor), I compared my algorithm with the algorithm proposed at the end of the last century by Inderjit S. Dhillon and shipped in the Intel MKL package, in both speed and accuracy. The results of the comparison were not in favor of Dhillon's algorithm.
And what do we have now? As for the orthogonality of the eigenvectors, the implementation in recent Intel releases is encouraging. However, parallelization is not implemented in dstegr in Intel MKL, and there are speed problems.
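For readers who want to put numbers on "orthogonality" and "accuracy", here is a minimal sketch of two commonly used checks: the loss of orthogonality max |v_i . v_j - delta_ij| and the residual max ||T v - lambda v||. It is not the code used for the comparison on my page; the function names are hypothetical, and the tiny 2x2 matrix with known eigenpairs is only there to make the example self-contained.

```python
import math

def tridiag_matvec(d, e, x):
    """Multiply the symmetric tridiagonal matrix (diagonal d, off-diagonal e) by x."""
    n = len(d)
    y = [d[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += e[i] * x[i + 1]      # superdiagonal contribution
        y[i + 1] += e[i] * x[i]      # subdiagonal contribution
    return y

def orthogonality_loss(vectors):
    """Largest deviation of v_i . v_j from the Kronecker delta."""
    worst = 0.0
    for i, vi in enumerate(vectors):
        for j, vj in enumerate(vectors):
            dot = sum(a * b for a, b in zip(vi, vj))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

def max_residual(d, e, values, vectors):
    """Largest infinity-norm of T v - lambda v over all eigenpairs."""
    worst = 0.0
    for lam, v in zip(values, vectors):
        tv = tridiag_matvec(d, e, v)
        worst = max(worst, max(abs(t - lam * x) for t, x in zip(tv, v)))
    return worst

# Illustrative input (an assumption, not data from the post):
# T = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
d, e = [2.0, 2.0], [1.0]
s = 1.0 / math.sqrt(2.0)
values = [1.0, 3.0]
vectors = [[s, -s], [s, s]]

orth = orthogonality_loss(vectors)
res = max_residual(d, e, values, vectors)
print("orthogonality loss:", orth)
print("max residual:", res)
```

For a real solver, the same two functions would be applied to the eigenvalues and eigenvectors it returns; for large n, the quadratic loop in orthogonality_loss would have to be replaced by a matrix product from a numerical library.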
For the tridiagonal matrix from item 2.2 of size n = 30001, my result is 56.6 sec (hardware configuration: i7 860 processor (2.80 GHz), DP55KG motherboard, DDR3 1333 MHz (8 GB), OS Windows XP Professional x64 Edition SP2, Intel MKL 10.2 Update 4, EM64T, HT off). dstegr from Intel MKL takes 19 min. 37 sec. (the result is scaled to a frequency of 2.80 GHz to compensate for Turbo Boost, because parallelization is not implemented in dstegr in Intel MKL). That is a difference of more than 20 times!
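The ratio follows directly from the two timings quoted above. A short sketch of the arithmetic (the variable names are mine, chosen only for illustration):

```python
# Timings quoted in the post for n = 30001.
my_time_s = 56.6                  # my algorithm, in seconds
dstegr_time_s = 19 * 60 + 37      # dstegr in Intel MKL: 19 min 37 sec

speedup = dstegr_time_s / my_time_s
print(f"dstegr: {dstegr_time_s} s, ratio: {speedup:.1f}x")
# prints "dstegr: 1177 s, ratio: 20.8x"
```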
The results presented here relate to an improved algorithm; information about it is published on my page. I also want to note that the parallelization of my algorithm is not complete (work on full parallelization is in progress), which makes its advantage over the RRR algorithm even more impressive. As for the accuracy of the eigenvectors, it is not inferior to that of the RRR algorithm. My web page (not currently available) and the publications that used my diagonalization can be downloaded here: http://depositfiles.com/files/fmy2ueaad