Results of parallelizing fast matrix multiplication algorithms using Intel MKL dgemm

Parallelizing fast matrix multiplication algorithms is a non-trivial task because of the large number of quadratic-cost operations: the amount of additional memory allocated has to be minimized without sacrificing multiplication speed. My recent results in this area, for three square 16000 × 16000 matrices held in memory and processed according to the formula C = C + A * B: my implementation takes 129 seconds against 186 seconds for Intel MKL dgemm, a speed-up of about 1.44 times (Windows XP x64, i7 860 processor, 8 gigabytes of 1333 MHz memory). Parallelization starts to pay off for matrices of at least 1500 × 1500. Intel MKL dgemm is used as the basic multiplication routine at the leaves of the recursion tree. I have also created a fast multiplication algorithm that allocates no additional memory. The gain there is more modest: about 8/7 of the speed of Intel MKL dgemm on large matrices. There is also a gain when one of the matrices is non-square. I use this to speed up many linear algebra problems, from solving systems of linear equations to singular value analysis.
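For readers who want to see the general shape of such a recursion, here is a minimal sketch. It assumes a Strassen-style scheme (seven half-size products instead of eight), which the post does not actually name; the 8/7 figure is merely consistent with one level of that scheme. The names `leaf_gemm`, `strassen`, `gemm_accumulate` and the `cutoff` parameter are hypothetical, and `leaf_gemm` is a pure-Python stand-in for the dgemm call at the leaves, not the MKL routine itself.

```python
# Sketch only: the Strassen scheme is an assumption, not the author's confirmed method.

def leaf_gemm(A, B):
    """Classical product; stand-in for the dgemm call at the recursion leaves."""
    k, m = len(B), len(B[0])
    return [[sum(row[p] * B[p][j] for p in range(k)) for j in range(m)] for row in A]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Split a square matrix of even order into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B, cutoff=2):
    """Multiply square matrices; fall back to leaf_gemm for small or odd orders."""
    n = len(A)
    if n <= cutoff or n % 2:
        return leaf_gemm(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven independent half-size products: these are the natural units
    # to hand to separate threads in a parallel version.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    # Quadratic-cost recombination of the seven products.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom

def gemm_accumulate(C, A, B, cutoff=2):
    """Return C + A * B, the update form used in the post."""
    return add(C, strassen(A, B, cutoff))
```

This sketch allocates temporaries freely for every sum and product, which is exactly the memory cost the post is concerned with minimizing; a practical version would reuse buffers and pick `cutoff` by measurement, since the post reports a gain only from about 1500 × 1500 upward.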

Legendary intelligence officer Drozdov was nicknamed «Fabergé» owing to his unique capability to work with information, to get information, and to convert it into the most precious treasures.