Multiplying Matrices Using dgemm
Use dgemm to Multiply Matrices
/* C source code is found in dgemm_example.c */

#define min(x,y) (((x) < (y)) ? (x) : (y))

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

int main()
{
    double *A, *B, *C;
    int m, n, k, i, j;
    double alpha, beta;

    printf ("\n This example computes real matrix C=alpha*A*B+beta*C using \n"
            " Intel(R) MKL function dgemm, where A, B, and C are matrices and \n"
            " alpha and beta are double precision scalars\n\n");

    /* A is m x k, B is k x n, C is m x n */
    m = 2000, k = 200, n = 1000;
    printf (" Initializing data for matrix multiplication C=A*B for matrix \n"
            " A(%ix%i) and matrix B(%ix%i)\n\n", m, k, k, n);
    alpha = 1.0; beta = 0.0;

    printf (" Allocating memory for matrices aligned on 64-byte boundary for better \n"
            " performance \n\n");
    A = (double *)mkl_malloc( m*k*sizeof( double ), 64 );
    B = (double *)mkl_malloc( k*n*sizeof( double ), 64 );
    C = (double *)mkl_malloc( m*n*sizeof( double ), 64 );
    if (A == NULL || B == NULL || C == NULL) {
        printf( "\n ERROR: Can't allocate memory for matrices. Aborting... \n\n");
        mkl_free(A);
        mkl_free(B);
        mkl_free(C);
        return 1;
    }

    printf (" Initializing matrix data \n\n");
    for (i = 0; i < (m*k); i++) {
        A[i] = (double)(i+1);
    }

    for (i = 0; i < (k*n); i++) {
        B[i] = (double)(-i-1);
    }

    for (i = 0; i < (m*n); i++) {
        C[i] = 0.0;
    }

    printf (" Computing matrix product using Intel(R) MKL dgemm function via CBLAS interface \n\n");
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, A, k, B, n, beta, C, n);
    printf ("\n Computations completed.\n\n");

    /* Print at most the first 6 rows and columns of each matrix */
    printf (" Top left corner of matrix A: \n");
    for (i=0; i<min(m,6); i++) {
        for (j=0; j<min(k,6); j++) {
            printf ("%12.0f", A[j+i*k]);
        }
        printf ("\n");
    }

    printf ("\n Top left corner of matrix B: \n");
    for (i=0; i<min(k,6); i++) {
        for (j=0; j<min(n,6); j++) {
            printf ("%12.0f", B[j+i*n]);
        }
        printf ("\n");
    }

    printf ("\n Top left corner of matrix C: \n");
    for (i=0; i<min(m,6); i++) {
        for (j=0; j<min(n,6); j++) {
            printf ("%12.5G", C[j+i*n]);
        }
        printf ("\n");
    }

    printf ("\n Deallocating memory \n\n");
    mkl_free(A);
    mkl_free(B);
    mkl_free(C);

    printf (" Example completed. \n\n");
    return 0;
}
cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, A, k, B, n, beta, C, n);
 CblasRowMajor
 Indicates that the matrices are stored in row major order, with the elements of each row of the matrix stored contiguously as shown in the figure above.
 CblasNoTrans
 Enumeration type indicating that the matrices A and B should not be transposed or conjugate transposed before multiplication.
 m, n, k
 Integers indicating the size of the matrices:
 A: m rows by k columns
 B: k rows by n columns
 C: m rows by n columns
 alpha
 Real value used to scale the product of matrices A and B.
 A
 Array used to store matrix A.
 k
 Leading dimension of array A, or the number of elements between successive rows (for row major storage) in memory. In the case of this exercise the leading dimension is the same as the number of columns.
 B
 Array used to store matrix B.
 n
 Leading dimension of array B, or the number of elements between successive rows (for row major storage) in memory. In the case of this exercise the leading dimension is the same as the number of columns.
 beta
 Real value used to scale matrix C.
 C
 Array used to store matrix C.
 n
 Leading dimension of array C, or the number of elements between successive rows (for row major storage) in memory. In the case of this exercise the leading dimension is the same as the number of columns.
Compile and Link Your Code
 Windows* OS: icl /Qmkl src\dgemm_example.c
 Linux* OS, macOS*: icc -mkl src/dgemm_example.c
 Windows* OS: build, then build run_dgemm_example
 Linux* OS, macOS*: make, then make run_dgemm_example
Example                             Executable
dgemm_example.c                     run_dgemm_example
dgemm_with_timing.c                 run_dgemm_with_timing
matrix_multiplication.c             run_matrix_multiplication
dgemm_threading_effect_example.c    run_dgemm_threading_effect_example
Optimization Notice


Intel's compilers may or may not optimize to the same degree
for non-Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction
sets and other optimizations. Intel does not guarantee the availability,
functionality, or effectiveness of any optimization on microprocessors not
manufactured by Intel. Microprocessor-dependent optimizations in this product
are intended for use with Intel microprocessors. Certain optimizations not
specific to Intel microarchitecture are reserved for Intel microprocessors.
Please refer to the applicable product User and Reference Guides for more
information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
