Matrix multiplication in C++ using Intel® Parallel Studio XE 2015 Composer Edition