Developer Reference for Intel® oneAPI Math Kernel Library for C

ID 766684
Date 11/07/2023
Public



Computing Cluster FFT

The Intel® oneAPI Math Kernel Library (oneMKL) cluster FFT functions are provided with Fortran and C interfaces.

Cluster FFT computation is performed by the DftiComputeForwardDM and DftiComputeBackwardDM functions, called in a program that uses MPI, referred to below as an MPI program. When an MPI program starts, a number of processes are created, and MPI identifies each process by its rank. The processes are independent of one another and communicate via MPI. A function called in an MPI program is invoked in all the processes, and each process manipulates data according to its rank.

Input or output data for a cluster FFT transform is a sequence of real or complex values. A cluster FFT computation function operates on the local part of the input data, that is, the part of the data assigned to a particular process, and generates the local part of the output data. Each process performs its part of the computation, running in parallel and communicating through MPI, so that together the processes perform the entire FFT computation.

FFT computations using the Intel® oneAPI Math Kernel Library (oneMKL) cluster FFT functions are typically performed in the steps listed below:

  1. Initialize MPI by calling MPI_Init (this function must be called before any FFT function and any other MPI function).
  2. Allocate memory for the descriptor and create it by calling DftiCreateDescriptorDM.
  3. Set the values of configuration parameters by one or more calls to DftiSetValueDM.
  4. Obtain values of configuration parameters needed to create local data arrays; the values are retrieved by calling DftiGetValueDM.
  5. Initialize the descriptor for the FFT computation by calling DftiCommitDescriptorDM.
  6. Create arrays for local parts of input and output data and fill the local part of input data with values. (For more information, see Distributing Data among Processes.)
  7. Compute the transform by calling DftiComputeForwardDM or DftiComputeBackwardDM.
  8. Gather local output data into the global array using MPI functions. (This step is optional: you may instead use the distributed output data directly.)
  9. Release memory allocated for the descriptor by calling DftiFreeDescriptorDM.
  10. Finalize communication through MPI by calling MPI_Finalize (the function must be called after the last call to a cluster FFT function and the last call to an MPI function).

Several code examples in Examples for Cluster FFT Functions in the Code Examples appendix illustrate cluster FFT computations.
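
To make the idea of a process's "local part" concrete, the following plain C sketch splits n items among processes by rank using an even block split. This is only one common convention and is an assumption made for illustration: block_range is a hypothetical helper, not part of oneMKL, and the distribution that the cluster FFT functions actually use is the one reported by DftiGetValueDM, which may differ.

```c
/* Hypothetical helper (not part of oneMKL): even block split of n items
   over nprocs processes. The first (n % nprocs) ranks get one extra item.
   On return, *start is the first global index owned by `rank` and
   *count is the number of items it owns. */
void block_range(long n, int nprocs, int rank, long *start, long *count)
{
    long base  = n / nprocs;   /* items every rank receives            */
    long extra = n % nprocs;   /* leftover items, given to low ranks   */

    *count = base + (rank < extra ? 1 : 0);
    *start = (long)rank * base + (rank < extra ? rank : extra);
}
```

For instance, with n = 10 and nprocs = 4, rank 0 owns indices 0 to 2, rank 1 owns 3 to 5, rank 2 owns 6 and 7, and rank 3 owns 8 and 9.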