Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

Computing Cluster FFT

Intel® oneAPI Math Kernel Library cluster FFT functions are provided with Fortran and C interfaces. Cluster FFT computation is performed by functions called in a program that uses MPI, which is referred to here as an MPI program.

After an MPI program starts, a number of processes are created. MPI identifies each process by its rank. The processes are independent of one another and communicate via MPI. A function called in an MPI program is invoked in all the processes, and each process manipulates data according to its rank.

Input or output data for a cluster FFT transform is a sequence of real or complex values. A cluster FFT computation function operates on the local part of the input data, that is, the part of the data handled by a particular process, and generates the local part of the output data. Each process performs its part of the computation; running in parallel and communicating through MPI, the processes together perform the entire FFT computation.

FFT computations using the Intel® oneAPI Math Kernel Library cluster FFT functions are typically carried out in the steps listed below:
  1. Initiate MPI by calling the MPI initialization function (it must be called before any FFT function and any other MPI function).
  2. Allocate memory for the descriptor and create it by calling the descriptor creation function.
  3. Set configuration parameters to the required values by one or more calls to the function that sets configuration values.
  4. Obtain the values of configuration parameters needed to create local data arrays by calling the function that retrieves configuration values.
  5. Initialize the descriptor for the FFT computation by calling the descriptor initialization function.
  6. Create arrays for local parts of input and output data and fill the local part of input data with values. (For more information, see Distributing Data among Processes.)
  7. Compute the transform by calling a cluster FFT computation function.
  8. Gather local output data into the global array using MPI functions. (This step is optional because you may need to immediately employ the data differently.)
  9. Release the memory allocated for the descriptor by calling the descriptor release function.
  10. Finalize communication through MPI by calling the MPI finalization function (it must be called after the last call to a cluster FFT function and the last call to an MPI function).
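The steps above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names (`MPI_Init`, `DftiCreateDescriptorDM`, `DftiSetValueDM`, `DftiGetValueDM`, `DftiCommitDescriptorDM`, `DftiComputeForwardDM`, `DftiFreeDescriptorDM`, `MPI_Finalize`) and the `CDFT_LOCAL_*` parameters do not appear in the text above; they are assumed from the oneMKL cluster FFT C interface in `mkl_cdft.h`. The 64 x 32 size, the forward scale, and the unit-impulse input are illustrative choices only. Building it requires an MPI implementation and the oneMKL cluster FFT libraries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include "mkl_cdft.h"   /* assumed header for the cluster FFT C interface */

/* Abort all processes if a cluster FFT call reports an error. */
static void check(MKL_LONG status, const char *step)
{
    if (status != DFTI_NO_ERROR) {
        fprintf(stderr, "%s failed with status %ld\n", step, (long)status);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
}

int main(int argc, char *argv[])
{
    MKL_LONG lengths[2] = {64, 32};  /* illustrative global 2D size */
    MKL_LONG local_size = 0, nx = 0, x_start = 0, i, j;
    DFTI_DESCRIPTOR_DM_HANDLE desc = NULL;
    MKL_Complex16 *local = NULL;

    /* Step 1: initiate MPI before any FFT or other MPI call. */
    MPI_Init(&argc, &argv);

    /* Step 2: allocate and create the descriptor. */
    check(DftiCreateDescriptorDM(MPI_COMM_WORLD, &desc, DFTI_DOUBLE,
                                 DFTI_COMPLEX, 2, lengths), "create");

    /* Step 3: set a configuration parameter (illustrative scale factor). */
    check(DftiSetValueDM(desc, DFTI_FORWARD_SCALE,
                         1.0 / (double)(lengths[0] * lengths[1])), "set");

    /* Step 4: query the values needed to create the local arrays. */
    check(DftiGetValueDM(desc, CDFT_LOCAL_SIZE, &local_size), "get size");
    check(DftiGetValueDM(desc, CDFT_LOCAL_NX, &nx), "get nx");
    check(DftiGetValueDM(desc, CDFT_LOCAL_X_START, &x_start), "get start");

    /* Step 5: initialize the descriptor for computation. */
    check(DftiCommitDescriptorDM(desc), "commit");

    /* Step 6: create the local array and fill this process's rows.
     * The input is a unit impulse at global index (0, 0). */
    local = malloc((size_t)local_size * sizeof *local);
    if (local == NULL && local_size > 0)
        MPI_Abort(MPI_COMM_WORLD, 1);
    for (i = 0; i < nx; i++) {
        for (j = 0; j < lengths[1]; j++) {
            local[i * lengths[1] + j].real =
                (x_start + i == 0 && j == 0) ? 1.0 : 0.0;
            local[i * lengths[1] + j].imag = 0.0;
        }
    }

    /* Step 7: compute the forward transform in place on the local data. */
    check(DftiComputeForwardDM(desc, local), "compute");

    /* Step 8 (optional): gather the local results with MPI; omitted here. */

    /* Step 9: release the descriptor, then the local array. */
    check(DftiFreeDescriptorDM(&desc), "free");
    free(local);

    /* Step 10: finalize MPI after the last FFT and MPI call. */
    MPI_Finalize();
    return 0;
}
```

The sketch keeps the same order as the list: configuration values are set and queried before the descriptor is initialized, and the local array is sized from the queried value rather than computed by hand.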
Several code examples in Examples for Cluster FFT Functions in the Code Examples appendix illustrate cluster FFT computations.
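To make the notion of a local part (step 6) more concrete, the following sketch shows one way a process could derive the range of rows it owns from its rank. The helper `local_rows` and the `local_part` type are hypothetical and are not part of the library, and the even block split is only an assumption for illustration; in a real program the actual distribution should be taken from the values retrieved in step 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: half-open range of rows [start, start + count). */
typedef struct {
    size_t start;
    size_t count;
} local_part;

/* Hypothetical helper (not a library function): rows owned by process
 * `rank` when `n` rows are split as evenly as possible among `nprocs`
 * processes. The first (n % nprocs) ranks receive one extra row. */
local_part local_rows(size_t n, size_t nprocs, size_t rank)
{
    assert(nprocs > 0 && rank < nprocs);

    size_t base  = n / nprocs;   /* rows every rank receives */
    size_t extra = n % nprocs;   /* leftover rows, one per low rank */

    local_part p;
    p.count = base + (rank < extra ? 1 : 0);
    p.start = rank * base + (rank < extra ? rank : extra);
    return p;
}
```

With 10 rows and 4 processes, for instance, this split assigns rows 0 to 2 to rank 0 and rows 6 and 7 to rank 2, so each process allocates and fills only its own rows.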

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804