June 23, 2009 11:30 AM PDT
The following BLAS Level 1 functions (vector-vector operations) for Intel® 64 architectures are threaded in Intel® MKL 10.2 and later:
(CS,ZD,S,D)ROT
(C,Z,S,D)COPY
(C,Z,S,D)SWAP
• Speedups of up to 1.7 to 4.7 times over the previous version on a 4-core Intel® Core™ i7 processor, with the peak depending on where the data resides in cache.
• Speedups of up to 14 to 130 times over the previous version on a 24-core Intel® Xeon® processor 7400 series system, with the peak depending on where the data resides in cache.
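For readers less familiar with the routine names, the sketch below shows in plain Python what the real double-precision variants (DROT, DCOPY, DSWAP) compute. It is reference semantics only, not MKL code: the helper names `ref_rot`, `ref_copy`, and `ref_swap` are hypothetical, unit stride is assumed, and the increment arguments of the real routines are omitted. Each loop iteration touches only element i of the two vectors, so the iterations are independent of one another, which is the kind of loop that can be divided among threads.

```python
# Reference semantics only (assumption: unit stride, real double precision).
# These helper names are illustrative; they are not MKL entry points.

def ref_rot(x, y, c, s):
    """Apply a plane rotation in place: x <- c*x + s*y, y <- c*y - s*x."""
    for i in range(len(x)):
        xi, yi = x[i], y[i]          # read both old values before writing
        x[i] = c * xi + s * yi
        y[i] = c * yi - s * xi


def ref_copy(x, y):
    """Copy vector x into vector y element by element."""
    for i in range(len(x)):
        y[i] = x[i]


def ref_swap(x, y):
    """Exchange the contents of vectors x and y element by element."""
    for i in range(len(x)):
        x[i], y[i] = y[i], x[i]
```

With MKL itself, the same operations would go through the library's BLAS entry points rather than hand-written loops; the sketch is only meant to make the data access pattern visible.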
The following BLAS Level 2 functions (matrix-vector operations) for Intel® 64 architectures are threaded in Intel® MKL 10.2 and later:
(C,Z,S,D)TRMV
(S,D)SYMV
(S,D)SYR
(S,D)SYR2
• Speedups of up to 1.9 to 2.9 times over the previous version on a 4-core Intel® Core™ i7 processor, with the peak depending on where the data resides in cache.
• Speedups of up to 16 to 40 times over the previous version on a 24-core Intel® Xeon® processor 7400 series system, with the peak depending on where the data resides in cache.
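As a rough guide to what the Level 2 routines listed above compute, here is a plain-Python sketch of the real-valued operations. It is not MKL code and makes several simplifying assumptions: the helper names are hypothetical, matrices are square lists of rows, only the upper triangle is referenced (the real routines also accept a lower-triangle option), TRMV is shown without transpose and with a non-unit diagonal, and stride arguments are omitted.

```python
# Reference semantics only (assumptions: upper-triangle storage, unit stride,
# real data). Helper names are illustrative; they are not MKL entry points.

def ref_trmv_upper(A, x):
    """x <- A*x for upper-triangular A (no transpose, non-unit diagonal)."""
    n = len(x)
    for i in range(n):
        # x[j] for j >= i still holds its original value at this point
        x[i] = sum(A[i][j] * x[j] for j in range(i, n))


def ref_symv_upper(alpha, A, x, beta, y):
    """y <- alpha*A*x + beta*y for symmetric A, reading only its upper triangle."""
    n = len(x)
    for i in range(n):
        acc = 0.0
        for j in range(n):
            a_ij = A[i][j] if i <= j else A[j][i]   # mirror the upper triangle
            acc += a_ij * x[j]
        y[i] = alpha * acc + beta * y[i]


def ref_syr_upper(alpha, x, A):
    """A <- alpha*x*x^T + A, updating only the upper triangle."""
    n = len(x)
    for i in range(n):
        for j in range(i, n):
            A[i][j] += alpha * x[i] * x[j]


def ref_syr2_upper(alpha, x, y, A):
    """A <- alpha*x*y^T + alpha*y*x^T + A, updating only the upper triangle."""
    n = len(x)
    for i in range(n):
        for j in range(i, n):
            A[i][j] += alpha * (x[i] * y[j] + y[i] * x[j])
```

In `ref_symv_upper`, `ref_syr_upper`, and `ref_syr2_upper` each row of the result can be computed independently of the others, which suggests why these operations lend themselves to threading; how MKL actually partitions the work is not described in this article.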
This article applies to: Intel® Math Kernel Library Knowledge Base
Vipin Kumar E K (Intel)

