If you are used to uBLAS, you can take advantage of Intel® Math Kernel Library (Intel® MKL) in C++ by substituting Intel MKL BLAS calls for the Boost uBLAS matrix-matrix multiplication functions.
uBLAS is part of the Boost C++ open-source libraries and provides BLAS functionality for dense, packed, and sparse matrices. The library uses an expression template technique for passing expressions as function arguments, which enables evaluating vector and matrix expressions in one pass without temporary matrices. uBLAS provides two modes:
- Debug (safe) mode, default.
Type and conformance checking is performed.
- Release (fast) mode.
Enabled by the NDEBUG preprocessor symbol.
The documentation for Boost uBLAS is available at www.boost.org.
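The expression template technique mentioned above can be illustrated with a minimal sketch. It does not use uBLAS code: the names Expr, Vec, and Sum and the namespace et_sketch are hypothetical, chosen only to show how a sum of vectors can be recorded as a lightweight object and then evaluated in one loop on assignment, without intermediate vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace et_sketch {

// CRTP base class: marks a type as a vector expression.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Hypothetical vector type; not the uBLAS vector class.
class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }

    // Assigning an expression evaluates it in a single loop,
    // writing each element straight into this vector.
    template <class E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = expr[i];
        return *this;
    }

private:
    std::vector<double> data_;
};

// Node representing "l + r". It stores references to its operands
// and computes an element only when that element is requested.
template <class L, class R>
class Sum : public Expr<Sum<L, R>> {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    std::size_t size() const { return l_.size(); }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }

private:
    const L& l_;
    const R& r_;
};

// Building the expression allocates no vector; it only records the operands.
template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

}  // namespace et_sketch
```

With vectors a, b, c, and x of equal length, the statement x = a + b + c; builds a nested Sum object and fills x in one pass. Because the nodes hold references, such an expression should be assigned in the same statement that creates it rather than stored for later use.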
The example in this KB article demonstrates how to overload the prod() function to substitute Intel MKL gemm calls for uBLAS dense matrix-matrix multiplication. Though these functions break uBLAS expression templates and introduce temporary matrices, the performance advantage can be considerable for matrix sizes that are not too small (roughly, over 50).
You do not need to change existing prod() calls in your source code to use these functions. To enable them:
- Include the header file mkl_boost_ublas_matrix_prod.hpp in your code (from the attached mkl_and_boost_examples zip file).
- Add the appropriate Intel MKL libraries to the link line. Please refer to the Intel MKL Link Line Advisor online tool to select the appropriate libraries.
Only the following expressions are substituted:
prod( m1, m2 )
prod( trans(m1), m2 )
prod( trans(conj(m1)), m2 )
prod( conj(trans(m1)), m2 )
prod( m1, trans(m2) )
prod( trans(m1), trans(m2) )
prod( trans(conj(m1)), trans(m2) )
prod( conj(trans(m1)), trans(m2) )
prod( m1, trans(conj(m2)) )
prod( trans(m1), trans(conj(m2)) )
prod( trans(conj(m1)), trans(conj(m2)) )
prod( conj(trans(m1)), trans(conj(m2)) )
prod( m1, conj(trans(m2)) )
prod( trans(m1), conj(trans(m2)) )
prod( trans(conj(m1)), conj(trans(m2)) )
prod( conj(trans(m1)), conj(trans(m2)) )
These expressions are substituted in release mode only (with the NDEBUG preprocessor symbol defined). Supported uBLAS versions are Boost 1.34.1, 1.35.0, 1.36.0, and 1.37.0.
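The 16 expressions listed above are the combinations of four operand forms on each side: plain, trans(), trans(conj()), and conj(trans()). A gemm routine accepts each operand as-is, transposed, or conjugate-transposed (CblasNoTrans, CblasTrans, and CblasConjTrans in the CBLAS interface), which may be why conj() without trans() does not appear in the list. The sketch below shows one way such a mapping could be written. It is not the code in mkl_boost_ublas_matrix_prod.hpp: Mat, Trans, Conj, Op, Operand, unwrap, and the naive gemm are hypothetical stand-ins for the uBLAS types and the Intel MKL call.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

namespace sketch {

using cplx = std::complex<double>;

// Hypothetical dense row-major matrix; stands in for a uBLAS matrix.
struct Mat {
    std::size_t rows, cols;
    std::vector<cplx> a;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c) {}
    cplx& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    const cplx& operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Lightweight wrappers that only record which operation was requested.
template <class E> struct Trans { const E& e; };
template <class E> struct Conj  { const E& e; };
template <class E> Trans<E> trans(const E& e) { return {e}; }
template <class E> Conj<E>  conj(const E& e)  { return {e}; }

// Mirrors the three operand modes a gemm routine accepts.
enum class Op { NoTrans, Trans, ConjTrans };

// An operand reduced to (underlying matrix, op flag).
struct Operand { const Mat& m; Op op; };

// One unwrap() overload per supported operand form.
inline Operand unwrap(const Mat& m)              { return {m, Op::NoTrans}; }
inline Operand unwrap(const Trans<Mat>& t)       { return {t.e, Op::Trans}; }
inline Operand unwrap(const Trans<Conj<Mat>>& t) { return {t.e.e, Op::ConjTrans}; }
inline Operand unwrap(const Conj<Trans<Mat>>& c) { return {c.e.e, Op::ConjTrans}; }

// Element (i, j) of op(M).
inline cplx at(const Operand& x, std::size_t i, std::size_t j) {
    switch (x.op) {
        case Op::Trans:     return x.m(j, i);
        case Op::ConjTrans: return std::conj(x.m(j, i));
        default:            return x.m(i, j);
    }
}
inline std::size_t nrows(const Operand& x) { return x.op == Op::NoTrans ? x.m.rows : x.m.cols; }
inline std::size_t ncols(const Operand& x) { return x.op == Op::NoTrans ? x.m.cols : x.m.rows; }

// Naive stand-in for a gemm call: returns C = op(A) * op(B).
inline Mat gemm(const Operand& A, const Operand& B) {
    assert(ncols(A) == nrows(B));
    Mat C(nrows(A), ncols(B));
    for (std::size_t i = 0; i < C.rows; ++i)
        for (std::size_t j = 0; j < C.cols; ++j)
            for (std::size_t k = 0; k < ncols(A); ++k)
                C(i, j) += at(A, i, k) * at(B, k, j);
    return C;
}

// One generic prod() covers all 4 x 4 = 16 combinations. Any other
// operand form fails to compile because unwrap() has no overload for it.
template <class L, class R>
Mat prod(const L& l, const R& r) { return gemm(unwrap(l), unwrap(r)); }

}  // namespace sketch
```

In this sketch, a call such as sketch::prod(sketch::trans(sketch::conj(A)), B) reaches gemm with the flags ConjTrans and NoTrans and returns a new matrix. That returned matrix is the kind of temporary the article refers to when it says the substitution breaks expression templates.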
A code example in the ublas/source/sylvester.cpp file of the attached zip file illustrates the use of the Intel MKL uBLAS header file for solving a special case of the Sylvester equation.
To run the Intel MKL uBLAS examples, specify the BOOST_ROOT parameter in the make command. For instance, when using Boost version 1.37.0:
make lib32 BOOST_ROOT=<your_path>/boost_1_37_0
Revised Sample Note in 2015:
We revised the sample for Boost 1.57.0 because that release changed the definition of the function OneElement from 1.0 to 0.0, which caused the original sample to give a wrong iteration result. Developers working with recent Boost versions such as 1.57.0 should use the revised sample.
The samples were verified with Boost 1.56.0 and 1.57.0.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804