Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

BLAS-like Extensions

Intel® oneAPI Math Kernel Library provides C and Fortran routines to extend the functionality of the BLAS routines. These include routines to compute vector products, matrix-vector products, and matrix-matrix products.
Intel® oneAPI Math Kernel Library also provides routines for certain data manipulation tasks, including in-place and out-of-place matrix transposition combined with simple matrix arithmetic. The transposition operations are Copy As Is, Conjugate Transpose, Transpose, and Conjugate. Each routine can also scale the data during the transposition through the alpha and/or beta parameters, and each supports both row-major and column-major ordering.
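To make the combination of scaling, transposition, and storage order concrete, the following is a minimal reference sketch in plain C. It does not call the library; the names ref_layout and ref_scaled_transpose are hypothetical and exist only for this illustration. It assumes real double-precision data and the Transpose operation, computing B := alpha * transpose(A) out of place.

```c
#include <stddef.h>

/* Hypothetical layout tag for this sketch (not a library type). */
typedef enum { REF_ROW_MAJOR, REF_COL_MAJOR } ref_layout;

/* Hypothetical reference routine: B := alpha * transpose(A).
 * A holds rows x cols elements with leading dimension lda;
 * B holds cols x rows elements with leading dimension ldb. */
void ref_scaled_transpose(ref_layout layout, size_t rows, size_t cols,
                          double alpha, const double *a, size_t lda,
                          double *b, size_t ldb)
{
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            /* Offsets of A(i, j) and B(j, i) in the chosen storage order. */
            size_t ia = (layout == REF_ROW_MAJOR) ? i * lda + j : j * lda + i;
            size_t ib = (layout == REF_ROW_MAJOR) ? j * ldb + i : i * ldb + j;
            b[ib] = alpha * a[ia];
        }
    }
}
```

Only the index arithmetic changes between the two orderings; the scaling by alpha is applied to each element as it is copied.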
Table “BLAS-like Extensions” lists these routines.
The <?> symbol in the routine short names is a precision prefix that indicates the data type:

  s    float
  d    double
  c    MKL_Complex8
  z    MKL_Complex16
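As a rough illustration of what the prefixes imply for storage, the sketch below maps each prefix to an element size. The types ref_complex8 and ref_complex16 and the helper ref_element_size are hypothetical stand-ins defined only for this example; it assumes the complex types are laid out as a pair of float or double values (real part, then imaginary part), and real code would take the actual definitions from the library headers.

```c
#include <stddef.h>

/* Stand-ins for this sketch only (assumed layout: real part, then imaginary part). */
typedef struct { float real; float imag; } ref_complex8;    /* c prefix */
typedef struct { double real; double imag; } ref_complex16; /* z prefix */

/* Hypothetical helper: element size in bytes for a precision prefix,
 * or 0 if the prefix is not one of s, d, c, z. */
size_t ref_element_size(char prefix)
{
    switch (prefix) {
    case 's': return sizeof(float);
    case 'd': return sizeof(double);
    case 'c': return sizeof(ref_complex8);
    case 'z': return sizeof(ref_complex16);
    default:  return 0;
    }
}
```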
BLAS-like Extensions

Routine  Data Types         Description
         s, d, c, z         Computes groups of vector-scalar products added to a vector.
         s, d, c, z         Scales two vectors, adds them to one another, and stores the result in the vector.
         s, d, c, z         Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix.
         c, z               Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product.
         s, d, c, z         Computes scalar-matrix-matrix products and adds the results to scalar-matrix products for groups of general matrices.
         c, z               Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product.
         s, d, c, z         Solves a triangular matrix equation for a group of matrices.
         s, d, c, z         Performs scaling and in-place transposition/copying of matrices.
         s, d, c, z         Performs scaling and out-of-place transposition/copying of matrices.
         s, d, c, z         Performs two-strided scaling and out-of-place transposition/copying of matrices.
         s, d, c, z         Performs scaling and sum of two matrices including their out-of-place transposition/copying.
         s, d               Returns the number of bytes required to store the packed matrix.
         integer, bfloat16  Returns the number of bytes required to store the packed matrix.
         s, d               Performs scaling and packing of the matrix into the previously allocated buffer.
         integer, bfloat16  Packs the matrix into the previously allocated buffer.
         s, d               Computes a matrix-matrix product with general matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
         integer            Computes a matrix-matrix product with general integer matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
         bfloat16           Computes a matrix-matrix product with general matrices of bfloat16 data type where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
         bfloat16           Computes a matrix-matrix product with general matrices of bfloat16 data type.
         integer            Computes a matrix-matrix product with general integer matrices.
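The table entry describing a matrix-matrix product that updates only the upper or lower triangular part of the result can be sketched in plain C as follows. This is a reference illustration under stated assumptions, not the library routine: the name ref_gemm_triangular and its argument list are hypothetical, the matrices are real double precision, stored row-major without padding, and no transposition is applied to the inputs.

```c
#include <stddef.h>

/* Hypothetical reference routine for this sketch:
 * C := alpha * A * B + beta * C, applied only to the upper triangle
 * (upper != 0) or the lower triangle (upper == 0) of the n x n matrix C.
 * A is n x k and B is k x n; all matrices are row-major. */
void ref_gemm_triangular(int upper, size_t n, size_t k, double alpha,
                         const double *a, const double *b,
                         double beta, double *c)
{
    for (size_t i = 0; i < n; ++i) {
        /* Column range of row i that lies in the selected triangle. */
        size_t j_begin = upper ? i : 0;
        size_t j_end = upper ? n : i + 1;
        for (size_t j = j_begin; j < j_end; ++j) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
    }
}
```

Elements of C outside the selected triangle are never read or written, which is what distinguishes this operation from a full general matrix-matrix product.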
