I am porting legacy code that uses FFTW 2.1.5 from Linux to Windows, so I have built and successfully linked against MKL's FFTW wrappers. My question is about the wrappers' behavior for a 3-dimensional FFT, specifically the wrapper function fftwnd_mpi_local_sizes(). Shown below are the original FFTW output arguments and the MKL values the wrapper maps them to.
int *local_nx -> int *CDFT_LOCAL_NX,
int *local_x_start -> int *CDFT_LOCAL_X_START,
int *local_ny_after_transpose -> int *CDFT_LOCAL_OUT_NX,
int *local_y_start_after_transpose -> int *CDFT_LOCAL_OUT_X_START,
int *total_local_size -> int *CDFT_LOCAL_SIZE)
local_ny_after_transpose and local_y_start_after_transpose are not being set to the values that the original FFTW implementation returns. Our layout and data allocation for the MPI processes rely heavily on the original output. After looking over the MKL documentation, it appears that this is all MKL's FFT can provide. Unfortunately, the Y values are critical.
As an example, for a 36 by 16 by 14 (X, Y, Z) transform over 2 processors, the expected FFTW output is processor_1(plan,18,0,8,0,4032) and processor_2(plan,18,18,8,8,4032), but MKL outputs processor_1(plan,18,0,18,0,4032) and processor_2(plan,18,18,18,18,4032). This particular case may be predictable, but the sizes of X, Y, and Z are arbitrary, as is the number of processors, so in general the values are not easy to predict. Are there any solutions to this problem?
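For reference, here is a minimal sketch of how the Y values I expect could be computed by hand. It assumes that FFTW 2.1.5 splits a dimension into ceiling-sized blocks, which is consistent with the example above (16 over 2 processors gives 8 and 8) but which I have not confirmed for every case. The name block_local_size is hypothetical and is not part of FFTW or MKL. It only reproduces the numbers and says nothing about how MKL actually lays out the transposed data.

```c
#include <assert.h>

/* Hypothetical helper (not an FFTW or MKL function).
 * Assumption: a dimension of length n is split into blocks of size
 * ceil(n / n_pes); the last process that receives data gets the
 * remainder, and any processes beyond that get nothing. */
void block_local_size(int n, int my_pe, int n_pes,
                      int *local_n, int *local_start)
{
    int block, used_pes;

    assert(n > 0 && n_pes > 0 && my_pe >= 0 && my_pe < n_pes);

    block = (n + n_pes - 1) / n_pes;        /* ceil(n / n_pes) */
    used_pes = (n + block - 1) / block;     /* processes that hold data */

    if (my_pe >= used_pes) {
        /* This process holds no part of the dimension. */
        *local_n = 0;
        *local_start = 0;
    } else {
        *local_start = my_pe * block;
        /* The last process holding data gets whatever is left over. */
        *local_n = (my_pe == used_pes - 1) ? n - *local_start : block;
    }
}
```

Each process would call it with its MPI rank and the communicator size, for example block_local_size(16, rank, nprocs, &local_ny_after_transpose, &local_y_start_after_transpose) for the Y dimension in the example above.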
-Thank you all,