Developer Reference

Computing Cluster FFT

The Intel® MKL cluster FFT functions are provided with Fortran and C interfaces. Cluster FFT computation is performed by the DftiComputeForwardDM and DftiComputeBackwardDM functions, called in a program that uses MPI, referred to here as an MPI program.

After an MPI program starts, a number of processes are created. MPI identifies each process by its rank. The processes are independent of one another and communicate via MPI. A function called in an MPI program is invoked in all the processes, and each process manipulates data according to its rank.

Input or output data for a cluster FFT transform is a sequence of real or complex values. A cluster FFT computation function operates on the local part of the input data, that is, the part of the data to be processed in a particular process, and generates the local part of the output data. Each process performs its part of the computation in parallel and communicates through MPI, so that together the processes perform the entire FFT computation.

FFT computations using the Intel® MKL cluster FFT functions are typically performed in the following steps:
  1. Initiate MPI by calling MPI_Init (the function must be called prior to calling any cluster FFT function and any other MPI function).
  2. Allocate memory for the descriptor and create it by calling DftiCreateDescriptorDM.
  3. Specify values of configuration parameters by one or more calls to DftiSetValueDM.
  4. Obtain the values of configuration parameters needed to create local data arrays by calling DftiGetValueDM.
  5. Initialize the descriptor for the FFT computation by calling DftiCommitDescriptorDM.
  6. Create arrays for local parts of input and output data and fill the local part of input data with values. (For more information, see Distributing Data among Processes.)
  7. Compute the transform by calling DftiComputeForwardDM or DftiComputeBackwardDM.
  8. Gather local output data into the global array using MPI functions. (This step is optional because you may want to use the local data directly in some other way.)
  9. Release memory allocated for the descriptor by calling DftiFreeDescriptorDM.
  10. Finalize communication through MPI by calling MPI_Finalize (the function must be called after the last call to a cluster FFT function and the last call to an MPI function).
Several code examples in Examples for Cluster FFT Functions in the Code Examples appendix illustrate cluster FFT computations.
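
For step 8, the bookkeeping needed to gather local parts into a global array can be sketched as below. This is an illustration under stated assumptions, not part of the Intel MKL API: it assumes a two-dimensional row-major array split among processes into slabs of whole rows, and that each process's row count and starting row (for example, the values obtained from DftiGetValueDM) have already been collected on the root process. The function name slab_counts_displs is hypothetical. It computes per-process element counts and displacements of the kind passed to MPI_Gatherv as recvcounts and displs.

```c
#include <stddef.h>

/* Hypothetical helper: for each of nprocs processes, given the number of
   local rows and the index of the first local row in the global array,
   compute how many elements the process contributes and where its slab
   starts in the global array. Each row holds ny elements. */
void slab_counts_displs(int nprocs, const long *local_nx, const long *x_start,
                        long ny, int *counts, int *displs)
{
    int r;
    for (r = 0; r < nprocs; r++) {
        counts[r] = (int)(local_nx[r] * ny); /* elements owned by rank r */
        displs[r] = (int)(x_start[r] * ny);  /* offset of rank r's slab   */
    }
}
```

With two processes holding 3 and 2 rows of a 5-by-4 array, starting at rows 0 and 3, the counts are 12 and 8 and the displacements are 0 and 12. The counts and displacements are in units of array elements, so the MPI datatype used in the gather must describe one element.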
