Learning Lab

Introducing Batch GEMM Operations

The general matrix-matrix multiplication (GEMM) is a fundamental operation in most scientific, engineering, and data applications. There is an everlasting desire to make this operation run faster. Optimized numerical libraries like Intel® Math Kernel Library (Intel® MKL) typically offer parallel high-performing GEMM implementations to leverage the concurrent threads supported by modern multi-core architectures. This strategy works well when multiplying large matrices because all cores are used efficiently.

  • Developers
  • Partners
  • Professors
  • Students
  • Apple OS X*
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • Unix*
  • Windows*
  • C/C++
  • Fortran
  • Advanced
  • Beginner
  • Intermediate
  • Intel® Math Kernel Library
  • Intel Math Kernal Library (Intel MKL)
  • Development Tools
  • Optimization
  • Parallel Computing
  • Intel® Parallel Studio XE 2015 Update 3 Cluster Edition Readme

    The Intel® Parallel Studio XE 2015 Update 3 Cluster Edition for Linux* and Windows* combines all Intel® Parallel Studio XE and Intel® Cluster Tools into a single package. This multi-component software toolkit contains the core libraries and tools to efficiently develop, optimize, run, and distribute parallel applications for clusters with Intel processors.  This package is for cluster users who develop on and build for IA-32 and Intel® 64 architectures on Linux* and Windows*, as well as customers running over the Intel® Xeon Phi™ coprocessor on Linux*. It contains:

  • Developers
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • Server
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE Cluster Edition
  • Message Passing Interface
  • Cluster Computing
  • Intel Software Tools Technical Webinar Series

    These free technical webinars cover tips and techniques that will help sharpen your development skills to create faster, more reliable applications. Technical experts will cover topics ranging from vectorization, code migration, code optimization, using advanced threading techniques (e.g., OpenMP 4.0, Intel® Cilk™ Plus, Intel® TBB), and error checking. Bring programming questions to the live session for our technical experts to answer. A replay of each webinar will be available shortly after the live session so you can share with those unable to attend the live session.

  • Developers
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE
  • Intel® VTune™ Amplifier XE
  • Intel® Inspector XE
  • Intel® Advisor XE
  • Intel® Cilk™ Plus
  • Intel® MPI Library
  • Intel® Threading Building Blocks
  • Intel® Cluster Studio XE
  • Intel® Fortran Studio XE
  • Webinars
  • Intel® Many Integrated Core Architecture
  • Subscribe to Learning Lab