# BLAS-like Extensions

Intel® oneAPI Math Kernel Library provides C and Fortran routines that extend the functionality of the BLAS routines. These include routines to compute vector products, matrix-vector products, and matrix-matrix products.

Intel® oneAPI Math Kernel Library also provides routines for certain data manipulation tasks, including in-place and out-of-place matrix transposition combined with simple matrix arithmetic. The supported transposition operations are Copy As Is, Conjugate Transpose, Transpose, and Conjugate. Each routine can also scale the data during the transposition operation through the `alpha` and/or `beta` parameters, and each routine supports both row-major and column-major ordering.

The table “BLAS-like Extensions” lists these routines.
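To make the scaled transposition described above concrete, here is a minimal plain-C sketch of an out-of-place operation `B := alpha * op(A)` for real `double` data. The function name `ref_omatcopy` and its parameter list are hypothetical and chosen only for illustration; this is not the library's implementation or signature. Because the data is real, the conjugate variants reduce to Copy As Is and Transpose, so only `'N'` and `'T'` are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not a library routine).
 * Computes B := alpha * op(A) out of place for real double data.
 *   ordering: 'R' = row-major, 'C' = column-major
 *   trans:    'N' = copy as is, 'T' = transpose
 *   rows, cols: dimensions of A
 *   lda, ldb:   leading dimensions of A and B */
void ref_omatcopy(char ordering, char trans, size_t rows, size_t cols,
                  double alpha, const double *a, size_t lda,
                  double *b, size_t ldb)
{
    assert(ordering == 'R' || ordering == 'C');
    assert(trans == 'N' || trans == 'T');

    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            /* Linear index of element (i, j) of A in the chosen ordering. */
            size_t src = (ordering == 'R') ? i * lda + j : j * lda + i;

            /* The destination is (i, j) for a copy and (j, i) for a transpose. */
            size_t di = (trans == 'T') ? j : i;
            size_t dj = (trans == 'T') ? i : j;
            size_t dst = (ordering == 'R') ? di * ldb + dj : dj * ldb + di;

            b[dst] = alpha * a[src];
        }
    }
}
```

For example, calling `ref_omatcopy('R', 'T', 2, 3, 2.0, a, 3, b, 2)` on the row-major 2x3 matrix `{1, 2, 3, 4, 5, 6}` fills `b` with the row-major 3x2 matrix `{2, 8, 4, 10, 6, 12}`, that is, twice the transpose.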
The `<?>` symbol in the routine short names is a precision prefix that indicates the data type:

- `s`: `float`
- `d`: `double`
- `c`: `MKL_Complex8`
- `z`: `MKL_Complex16`
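The prefix convention can be illustrated with a small, hypothetical routine family `ref_?scal` that scales a vector in place, with one variant per prefix. The names `ref_complex8`, `ref_complex16`, and `ref_?scal` are stand-ins invented for this sketch; the real complex types come from the library headers. The sketch assumes the common layout of two 4-byte floats or two 8-byte doubles per complex value.

```c
#include <stddef.h>

/* Stand-in complex types for illustration only (assumed layout). */
typedef struct { float real, imag; } ref_complex8;    /* prefix c */
typedef struct { double real, imag; } ref_complex16;  /* prefix z */

/* Hypothetical family ref_?scal: x := alpha * x, one variant per prefix. */
void ref_sscal(size_t n, float alpha, float *x)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}

void ref_dscal(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}

void ref_cscal(size_t n, ref_complex8 alpha, ref_complex8 *x)
{
    for (size_t i = 0; i < n; ++i) {
        /* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
        float re = alpha.real * x[i].real - alpha.imag * x[i].imag;
        float im = alpha.real * x[i].imag + alpha.imag * x[i].real;
        x[i].real = re;
        x[i].imag = im;
    }
}

void ref_zscal(size_t n, ref_complex16 alpha, ref_complex16 *x)
{
    for (size_t i = 0; i < n; ++i) {
        double re = alpha.real * x[i].real - alpha.imag * x[i].imag;
        double im = alpha.real * x[i].imag + alpha.imag * x[i].real;
        x[i].real = re;
        x[i].imag = im;
    }
}
```

The four functions differ only in element type, which is the pattern the `<?>` placeholder abbreviates in the routine names.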
BLAS-like Extensions

| Routine | Data Types | Description |
| --- | --- | --- |
|  | s, d, c, z | Scales two vectors, adds them to one another, and stores the result in the vector. |
|  | s, d, c, z | Computes groups of vector-scalar products added to a vector. |
|  | s, d, c, z | Computes groups of diagonal matrix-general matrix products. |
|  | s, d | Allocates storage for a packed matrix. |
|  | s, d, c, z | Computes scalar-matrix-matrix products and adds the results to scalar matrix products for groups of general matrices. |
|  | bfloat16 | Computes a matrix-matrix product with general matrices of bfloat16 data type. |
|  | bfloat16 | Computes a matrix-matrix product with general matrices of bfloat16 data type where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product. |
|  | Integer | Computes a matrix-matrix product with general integer matrices. |
|  | s, d | Computes a matrix-matrix product with general matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product. |
|  | Integer | Computes a matrix-matrix product with general integer matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product. |
|  | s, d | Performs scaling and packing of the matrix into the previously allocated buffer. |
|  | Integer, bfloat16 | Packs the matrix into the previously allocated buffer. |
|  | s, d | Returns the number of bytes required to store the packed matrix. |
|  | Integer, bfloat16 | Returns the number of bytes required to store the packed matrix. |
|  | c, z | Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product. |
|  | c, z | Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product. |
|  | s, d, c, z | Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix. |
|  | s, d, c, z | Computes groups of matrix-vector products using general matrices. |
|  | s, d, c, z | Solves a triangular matrix equation for a group of matrices. |
|  | s, d, c, z | Performs scaling and in-place transposition/copying of matrices. |
|  | s, d, c, z | Performs scaling and sum of two matrices including their out-of-place transposition/copying. |
|  | s, d, c, z | Performs scaling and out-of-place transposition/copying of matrices. |
|  | s, d, c, z | Performs two-strided scaling and out-of-place transposition/copying of matrices. |
|  | s, d, c, z | Creates a handle on a jitter and generates a GEMM kernel that computes a scalar-matrix-matrix product and adds the result to a scalar-matrix product, with general matrices. |
|  |  | Deletes the previously created jitter and the generated GEMM kernel. |
|  | s, d, c, z | Returns the GEMM kernel previously generated. |
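Two of the descriptions in the table can be sketched in plain C to show what the operations compute: the scaled sum of two vectors, `y := alpha*x + beta*y`, and a matrix-matrix product that updates only one triangle of the result. The function names `ref_axpby` and `ref_gemmt_rowmajor`, their parameter lists, and the restriction to unit stride, row-major storage, and no transposition are all assumptions made for this illustration. They are not the library's signatures, which are given on the reference pages of the corresponding routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper: y := alpha * x + beta * y (unit stride). */
void ref_axpby(size_t n, double alpha, const double *x,
               double beta, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + beta * y[i];
}

/* Hypothetical reference helper: C := alpha * A * B + beta * C, where only
 * the upper (uplo == 'U') or lower (uplo == 'L') triangle of the n x n
 * matrix C is updated. A is n x k and B is k x n, all row-major. */
void ref_gemmt_rowmajor(char uplo, size_t n, size_t k, double alpha,
                        const double *a, size_t lda,
                        const double *b, size_t ldb,
                        double beta, double *c, size_t ldc)
{
    assert(uplo == 'U' || uplo == 'L');
    assert(lda >= k && ldb >= n && ldc >= n);

    for (size_t i = 0; i < n; ++i) {
        /* Column range of row i that lies in the selected triangle. */
        size_t j_begin = (uplo == 'U') ? i : 0;
        size_t j_end = (uplo == 'U') ? n : i + 1;

        for (size_t j = j_begin; j < j_end; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

Elements of `C` outside the selected triangle are never read or written, which is what distinguishes this operation from a full general matrix-matrix product.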

#### Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.