General Matrix Multiply Sample
Features / Description
The General Matrix Multiply (GEMM) sample demonstrates how to use an OpenCL* device efficiently to perform a general matrix multiply operation on two dense square matrices. The sample primarily targets devices with cache memory: Intel® Xeon Phi™ and Intel® Architecture CPU OpenCL devices.
The sample also:
- optimizes the naive nested-loop matrix multiplication to use the memory cache more efficiently
- supports single-precision and double-precision data types
- demonstrates how to use different storage methods for matrices
- demonstrates how to use the automatic vectorizer efficiently and avoid gather operations
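The cache optimization and gather avoidance listed above can be illustrated with a minimal host-side sketch in plain C. This is not the sample's OpenCL kernel. The function names gemm_naive and gemm_tiled, the tile parameter, and the row-major layout are assumptions made for illustration only. The tiled version processes the matrices in small blocks so that the working set is more likely to stay in cache, and its innermost loop walks both B and C with unit stride, which is the access pattern a vectorizer can typically handle with contiguous loads instead of gathers.

```c
#include <assert.h>
#include <stddef.h>

/* Naive triple loop: C = A * B for n x n row-major matrices.
   The innermost loop reads B with stride n, which is cache-unfriendly. */
void gemm_naive(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Hypothetical helper: smaller of two sizes. */
static size_t min_size(size_t x, size_t y)
{
    return x < y ? x : y;
}

/* Tiled (blocked) variant. The tile size is an illustrative parameter;
   a good value depends on the cache sizes of the target device. */
void gemm_tiled(size_t n, size_t tile, const float *a, const float *b, float *c)
{
    assert(tile > 0);

    /* C is accumulated across k-tiles, so it must start at zero. */
    for (size_t idx = 0; idx < n * n; ++idx)
        c[idx] = 0.0f;

    for (size_t ii = 0; ii < n; ii += tile) {
        for (size_t kk = 0; kk < n; kk += tile) {
            for (size_t jj = 0; jj < n; jj += tile) {
                /* Clamp tile bounds so n need not be a multiple of tile. */
                size_t i_end = min_size(ii + tile, n);
                size_t k_end = min_size(kk + tile, n);
                size_t j_end = min_size(jj + tile, n);

                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        float a_ik = a[i * n + k];
                        /* Unit-stride access to both B and C. */
                        for (size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += a_ik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions compute the same product. The difference is only in the order in which elements are visited, which is what determines cache reuse and whether the innermost loop is contiguous.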
Supported Devices: CPU, Intel® Xeon Phi™ coprocessor
Complexity Level: Intermediate
Refer to the Release Notes document for information on system requirements.
* OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
This software is subject to the U.S. Export Administration Regulations and other U.S. law, and may not be exported or re-exported to certain countries (Burma, Cuba, Iran, Libya, North Korea, Sudan, and Syria) or to persons or entities prohibited from receiving U.S. exports (including Denied Parties, Specially Designated Nationals, and entities on the Bureau of Export Administration Entity List or involved with missile technology or nuclear, chemical or biological weapons).