Offloading MKL C FFT functions from Java


Can someone point me to any examples of offloading C functions from Java to the MIC? I am trying to offload MKL DFTI functions from Java. Any help or resources would be welcome.

Just to clarify, I am able to use MKL from Java; however, I need direction when it comes to offloading to the MIC.






Is there an automatic offload version of the MKL function you are using? If so, then you might be able to have MKL automatically do the offloading depending upon conditions on the host.


Best Reply

Hi Armin,

Based on information from a developer here at Intel, here is what I found:

  1. Java Native Interface (JNI) works only with Automatic Offload (AO) mode; it doesn’t work with Compiler Assisted Offload (CAO) mode for now.

  2. With AO mode, offloading to the MIC is straightforward.

  3. There are some useful articles that may help you write a Java application that can offload functions to coprocessors:

     - "How do I use Intel MKL with Java": This article shows an example of how to call cblas_dgemm from Java.

     - "Intel MKL Automatic Offload Enabled functions for Intel Xeon Phi coprocessors": These are the MKL functions that are AO-enabled for now; the list may grow in upcoming versions. Keep in mind that each function has a data-size threshold above which AO can give an advantage. We set it from our observations of data-transfer overhead versus computation gain on the coprocessor. The effective size may be somewhat higher once JNI overhead is included.

     - "Automatic Offload Controls": This table lists the useful environment variables for Automatic Offload.

     - "How to control the work division in Intel MKL on Intel Xeon Phi": To use the coprocessor, you need to enable the MIC with MKL_MIC_ENABLE. You can optionally set work-division controls; otherwise MKL adjusts the work division automatically.
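As a minimal sketch of the environment-variable setup mentioned above: a Java program cannot portably change its own environment after startup, so one option is to launch the JVM that calls MKL as a child process with MKL_MIC_ENABLE already set. The class name `AoLauncher` and the main class `MyMklApp` are hypothetical, and the `OFFLOAD_REPORT` level is just an illustrative choice. Setting the variable in the shell before starting `java` should work equally well.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Sketch: prepare a child JVM whose environment enables MKL Automatic Offload. */
final class AoLauncher {

    /** Builds (but does not start) a process with AO-related variables set. */
    static ProcessBuilder withAutomaticOffload(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        Map<String, String> env = pb.environment();
        env.put("MKL_MIC_ENABLE", "1"); // enable Automatic Offload
        env.put("OFFLOAD_REPORT", "2"); // assumed level: ask for offload activity reports
        pb.inheritIO();                 // forward the child's output to this console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // "MyMklApp" is a hypothetical class that calls MKL through JNI.
        ProcessBuilder pb =
            withAutomaticOffload(Arrays.asList("java", "-cp", ".", "MyMklApp"));
        System.out.println("MKL_MIC_ENABLE=" + pb.environment().get("MKL_MIC_ENABLE"));
        // pb.start().waitFor(); // uncomment on a machine with MKL and a coprocessor
    }
}
```

Whether a given call is actually offloaded still depends on that function being AO-enabled and on the problem size, as described in the list above.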

For your information, in the near future we will publish an article and a sample on offloading MKL calls from a Java program using JNI (with Automatic Offload mode).

It usually does not make sense to offload FFTs: the time required to transfer the data to and from the Xeon Phi coprocessor is usually comparable to the time required to perform the FFT on the host.

Once the data is on the Xeon Phi, FFTs can be very fast, but you need to keep the data on the coprocessor so that you can perform multiple computations to amortize the cost of the data transfers. Automatic offload mode is not well suited to controlling persistence of data on the coprocessor, while manual offloading provides good controls for managing data transfers.
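To make the amortization argument concrete, here is a rough back-of-the-envelope model. All three rates (PCIe bandwidth, host throughput, coprocessor throughput) are illustrative assumptions, not measurements, and the class name `FftOffloadModel` is hypothetical. The flop count uses the conventional 5 n log2 n estimate for a complex FFT of length n.

```java
/** Rough cost model: FFTs on the host vs. on a coprocessor behind PCIe. */
final class FftOffloadModel {
    // Assumed rates, chosen only for illustration.
    static final double PCIE_BYTES_PER_SEC = 6.0e9; // sustained PCIe bandwidth
    static final double HOST_FLOPS = 50.0e9;        // host FFT throughput
    static final double MIC_FLOPS = 100.0e9;        // coprocessor FFT throughput

    /** Conventional flop estimate for a complex FFT of length n. */
    static double fftFlops(long n) {
        return 5.0 * n * (Math.log(n) / Math.log(2.0));
    }

    /** Seconds to move n double-precision complex values (16 bytes each) one way. */
    static double transferSeconds(long n) {
        return 16.0 * n / PCIE_BYTES_PER_SEC;
    }

    /** Time to run `ffts` transforms of length n on the host. */
    static double hostSeconds(long n, int ffts) {
        return ffts * fftFlops(n) / HOST_FLOPS;
    }

    /** Data is sent once and returned once; `ffts` transforms run in between. */
    static double offloadSeconds(long n, int ffts) {
        return 2.0 * transferSeconds(n) + ffts * fftFlops(n) / MIC_FLOPS;
    }

    public static void main(String[] args) {
        long n = 1L << 20; // one million-point transform
        for (int ffts = 1; ffts <= 16; ffts *= 2) {
            System.out.printf("ffts=%2d host=%.2f ms offload=%.2f ms%n",
                ffts, 1e3 * hostSeconds(n, ffts), 1e3 * offloadSeconds(n, ffts));
        }
    }
}
```

Under these assumed numbers, a single transform is slower when offloaded because the two transfers dominate, and offloading only pays off once several transforms reuse data that stays on the coprocessor. Real break-even points depend on measured bandwidth and FFT performance.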

"Dr. Bandwidth"
